[jira] [Commented] (YARN-6819) Application report fails if app rejected due to nodesize
[ https://issues.apache.org/jira/browse/YARN-6819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092610#comment-16092610 ] Sunil G commented on YARN-6819: --- +1 > Application report fails if app rejected due to nodesize > > > Key: YARN-6819 > URL: https://issues.apache.org/jira/browse/YARN-6819 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-6819.001.patch, YARN-6819.002.patch, > YARN-6819.003.patch, YARN-6819.004.patch > > > In YARN-5006 application rejected when nodesize limit is exceeded. > {{FinalSavingTransition}} stateBeforeFinalSaving not set after skipping save > to store which causes application report failure -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6819) Application report fails if app rejected due to nodesize
[ https://issues.apache.org/jira/browse/YARN-6819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092601#comment-16092601 ] Rohith Sharma K S commented on YARN-6819: - +1 lgtm, committing shortly > Application report fails if app rejected due to nodesize > > > Key: YARN-6819 > URL: https://issues.apache.org/jira/browse/YARN-6819 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-6819.001.patch, YARN-6819.002.patch, > YARN-6819.003.patch, YARN-6819.004.patch > > > In YARN-5006 application rejected when nodesize limit is exceeded. > {{FinalSavingTransition}} stateBeforeFinalSaving not set after skipping save > to store which causes application report failure -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6771) Use classloader inside configuration class to make new classes
[ https://issues.apache.org/jira/browse/YARN-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092568#comment-16092568 ] Jongyoul Lee commented on YARN-6771: Thanks for the reply. I'm developing a cluster manager feature in Apache Zeppelin. I've tried to use the YARN client to communicate with the YARN cluster. To avoid dependency conflicts, I wanted to use a custom classloader for initializing the YARN client in Zeppelin's server process, but it failed because this implementation uses the system classloader even though I set the custom classloader in the Configuration. I changed my implementation to include the Hadoop libs in the process directly, but I think we need to change it to work correctly. > Use classloader inside configuration class to make new classes > --- > > Key: YARN-6771 > URL: https://issues.apache.org/jira/browse/YARN-6771 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.8.1, 3.0.0-alpha4 >Reporter: Jongyoul Lee > Fix For: 2.8.2 > > Attachments: YARN-6771-1.patch, YARN-6771-2.patch, YARN-6771.patch > > > While running {{RpcClientFactoryPBImpl.getClient}}, > {{RpcClientFactoryPBImpl}} uses {{localConf.getClassByName}}. But when using a > custom classloader, we have to use {{conf.getClassByName}} because the custom > classloader is already stored in the {{Configuration}} class. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
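A minimal sketch of the behavior described above, assuming a hypothetical class name and jar location; {{Configuration.setClassLoader}} and {{Configuration.getClassByName}} are the relevant APIs, while the factory behavior is paraphrased from the comment, not copied from the YARN source:
{code}
import java.net.URL;
import java.net.URLClassLoader;

import org.apache.hadoop.conf.Configuration;

public class ClassLoaderSketch {
  public static void main(String[] args) throws Exception {
    // Isolating classloader that holds the Hadoop/YARN classes (path is hypothetical).
    ClassLoader custom = new URLClassLoader(
        new URL[] {new URL("file:///opt/zeppelin/hadoop-libs/")},
        ClassLoader.getSystemClassLoader());

    Configuration conf = new Configuration();
    conf.setClassLoader(custom);

    // Resolved through the custom loader because it is stored in this conf.
    // "org.example.SomeClientPBImpl" is a placeholder class name.
    Class<?> viaConf = conf.getClassByName("org.example.SomeClientPBImpl");
    System.out.println("conf sees " + viaConf.getName());

    // A factory that builds its own Configuration never sees the custom loader
    // and falls back to the default one, which is the failure described above.
    Configuration localConf = new Configuration();
    try {
      localConf.getClassByName("org.example.SomeClientPBImpl");
    } catch (ClassNotFoundException expected) {
      System.out.println("default loader cannot see it: " + expected.getMessage());
    }
  }
}
{code}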
[jira] [Commented] (YARN-6777) Support for ApplicationMasterService processing chain of interceptors
[ https://issues.apache.org/jira/browse/YARN-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092507#comment-16092507 ] Hadoop QA commented on YARN-6777: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 39s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 26s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 32s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 23s{color} | {color:green} hadoop-yarn-common in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 42s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 98m 41s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-6777 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877900/YARN-6777.006.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml | | uname | Linux 1ad0b3188620 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / daaf530 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/16486/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test
[jira] [Commented] (YARN-5146) [YARN-3368] Supports Fair Scheduler in new YARN UI
[ https://issues.apache.org/jira/browse/YARN-5146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092425#comment-16092425 ] Abdullah Yousufi commented on YARN-5146: Hey [~sunilg], I'm still unable to reproduce this issue. Is there something you do in the UI that triggers the error? Also, I'm unsure if it has to do with the existing adapters, because the URL is trying to access yarn-queue.yarn-queues, but there is not supposed to be an adapter or any file by that name (specifically a yarn-queues.js in a yarn-queue directory). If you want, I can create an EC2 instance with this patch to see if you can replicate this issue there. Or let me know if you have any other thoughts on how I should debug this. Thanks. > [YARN-3368] Supports Fair Scheduler in new YARN UI > -- > > Key: YARN-5146 > URL: https://issues.apache.org/jira/browse/YARN-5146 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Abdullah Yousufi > Attachments: YARN-5146.001.patch, YARN-5146.002.patch, > YARN-5146.003.patch > > > Current implementation in branch YARN-3368 only supports capacity scheduler, > we want to make it support fair scheduler. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6840) Leverage RMStateStore to store scheduler configuration updates
[ https://issues.apache.org/jira/browse/YARN-6840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092405#comment-16092405 ] Daniel Templeton commented on YARN-6840: Are you sure that's a good idea? We're already abusing ZK pretty badly with the amount of junk we cram in there as is. Zookeeper is pretty clear in the docs that it should not be used as a general purpose data store. > Leverage RMStateStore to store scheduler configuration updates > -- > > Key: YARN-6840 > URL: https://issues.apache.org/jira/browse/YARN-6840 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan > > With this change, user doesn't have to setup separate storage system (like > LevelDB) to store updates of scheduler configs. And dynamic queue can be used > when RM HA is enabled. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6777) Support for ApplicationMasterService processing chain of interceptors
[ https://issues.apache.org/jira/browse/YARN-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092392#comment-16092392 ] Hadoop QA commented on YARN-6777: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 46s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 53s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 30s{color} | {color:red} hadoop-yarn-api in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 28s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 19s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 28s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 27s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 26s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 9 new + 859 unchanged - 0 fixed = 868 total (was 859) {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 39s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 34s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 32s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 58m 23s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-6777 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877900/YARN-6777.006.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml | | uname | Linux f956de57a510 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | |
[jira] [Commented] (YARN-6778) In ResourceWeights, weights and setWeights() should be final
[ https://issues.apache.org/jira/browse/YARN-6778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092343#comment-16092343 ] Yufei Gu commented on YARN-6778: Committed to trunk and branch-2. Thanks [~templedf]. > In ResourceWeights, weights and setWeights() should be final > > > Key: YARN-6778 > URL: https://issues.apache.org/jira/browse/YARN-6778 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.8.1, 3.0.0-alpha4 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Minor > Labels: newbie > Fix For: 3.0.0-beta1, 2.9 > > Attachments: YARN-6778.001.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6788) Improve performance of resource profile branch
[ https://issues.apache.org/jira/browse/YARN-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092337#comment-16092337 ] Daniel Templeton commented on YARN-6788: Here's my first-pass comments: * In {{Resource.equals()}} instead of the nested _for_ loops: {code}for (ResourceInformation entry : getResources()) { for (ResourceInformation otherEntry : other.getResources()) { if (entry.getName().equals(ResourceInformation.MEMORY_MB.getName()) || entry.getName().equals(ResourceInformation.VCORES.getName())) { continue; } if (entry.getName().equals(otherEntry.getName())) { if (!entry.equals(otherEntry)) { return false; } } } }{code} would it be better to grab the resource index map and iterate through that instead? If you do that, you can also skip the special casing of the memory and CPU tests. * In {{Resource.compareTo()}}, this: {code}ResourceInformation[] thisResources, otherResources; thisResources = this.getResources(); otherResources = other.getResources();{code} should be: {code} ResourceInformation[] thisResources = this.getResources(); ResourceInformation[] otherResources = other.getResources();{code} * In {{Resource.compareTo()}}, array length is an int, so {{diff}} should be an int. * In {{Resource.compareTo()}} we assume that if the number of resource types is the same, then they're equal. Is that sound? It doesn't seem like that will produce a consistent sort order. I have the same concern iterating through rest of the resources. Seems like it should instead be iterating through the resource type index map. * {{ResourcePBImpl.getResources()}} should call {{super.getResources()}} instead of reimplementing the logic. * {{ResourcePBImpl.getResourceValue()}} should call {{super.getResourceValue()}} instead of reimplementing the logic. * {{ResourcePBImpl.getResourceInformation()}} should call {{super.getResourceInformation()}} instead of reimplementing the logic. * In {{ResourcePBImpl.mergeLocalToBuilder()}} the _for_ loop should be a _for each_. * Seems like you should add a {{ResourceUtils.getResourceType(String resource)}} method to simplify the code. * In the {{BaseResource()}} constructor, I don't see a reason to special case the memory and CPU. Just handle them in the loop with the other resources. * {{ResourceUtils.indexForResourceInformation}} should be final. * {{ResourceUtils.getResourceTypesArray()}} should return {{readOnlyResourcesArray}} instead of recreating it. * {{ResourceUtils.getResourceTypesMinimumAllocation()}} and {{ResourceUtils.getResourceTypesMaximumAllocation()}} should use {{readOnlyResourcesArray}} instead of calling getResourceTypesArray(). * Unrelated to this patch, but {{ResourceUtils.getResourceTypesMinimumAllocation()}} and {{ResourceUtils.getResourceTypesMaximumAllocation()}} would be a lot clearer with an _else_ rather than the _continue_ statements. * Why not convert {{Resources.FixedValueResource}} to extend {{BaseResource}}? * In {{TestResourceUtils.testGetResourceInformation()}}, it seems like we should be able to compare the resource arrays since the order is now fixed instead of having to compare the maps element by element. 
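To make the first suggestion concrete, here is a rough sketch (not the patch itself) of what an index-based {{equals()}} could look like, assuming both {{Resource}} objects expose their {{ResourceInformation}} entries in the same fixed index order:
{code}
@Override
public boolean equals(Object obj) {
  if (this == obj) {
    return true;
  }
  if (!(obj instanceof Resource)) {
    return false;
  }
  Resource other = (Resource) obj;
  ResourceInformation[] thisResources = getResources();
  ResourceInformation[] otherResources = other.getResources();
  if (thisResources.length != otherResources.length) {
    return false;
  }
  // Index i refers to the same resource type in both arrays, so memory and
  // vcores need no special casing and no nested loop is required.
  for (int i = 0; i < thisResources.length; i++) {
    if (!thisResources[i].equals(otherResources[i])) {
      return false;
    }
  }
  return true;
}
{code}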
> Improve performance of resource profile branch > -- > > Key: YARN-6788 > URL: https://issues.apache.org/jira/browse/YARN-6788 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Sunil G >Assignee: Sunil G >Priority: Blocker > Attachments: YARN-6788-YARN-3926.001.patch, > YARN-6788-YARN-3926.002.patch, YARN-6788-YARN-3926.003.patch, > YARN-6788-YARN-3926.004.patch, YARN-6788-YARN-3926.005.patch, > YARN-6788-YARN-3926.006.patch > > > Currently we could see a 15% performance delta with this branch. > Few performance improvements to improve the same. > Also this patch will handle > [comments|https://issues.apache.org/jira/browse/YARN-6761?focusedCommentId=16075418=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16075418] > from [~leftnoteasy]. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6777) Support for ApplicationMasterService processing chain of interceptors
[ https://issues.apache.org/jira/browse/YARN-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6777: -- Attachment: YARN-6777.006.patch Updating javadocs. > Support for ApplicationMasterService processing chain of interceptors > - > > Key: YARN-6777 > URL: https://issues.apache.org/jira/browse/YARN-6777 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-6777.001.patch, YARN-6777.002.patch, > YARN-6777.003.patch, YARN-6777.004.patch, YARN-6777.005.patch, > YARN-6777.006.patch > > > This JIRA extends the Processor introduced in YARN-6776 with a configurable > processing chain of interceptors. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6778) In ResourceWeights, weights and setWeights() should be final
[ https://issues.apache.org/jira/browse/YARN-6778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092312#comment-16092312 ] Yufei Gu commented on YARN-6778: Indeed. We should put this into style checking. +1 for this patch. > In ResourceWeights, weights and setWeights() should be final > > > Key: YARN-6778 > URL: https://issues.apache.org/jira/browse/YARN-6778 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.8.1, 3.0.0-alpha4 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Minor > Labels: newbie > Attachments: YARN-6778.001.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6777) Support for ApplicationMasterService processing chain of interceptors
[ https://issues.apache.org/jira/browse/YARN-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092309#comment-16092309 ] Arun Suresh commented on YARN-6777: --- bq. is it OK to propagate the exception in getProcessorList.. Think it should be fine.. it is called in the serviceInit() method, which is allowed to throw any arbitrary Exception. Will fix the comments.. thanks.. > Support for ApplicationMasterService processing chain of interceptors > - > > Key: YARN-6777 > URL: https://issues.apache.org/jira/browse/YARN-6777 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-6777.001.patch, YARN-6777.002.patch, > YARN-6777.003.patch, YARN-6777.004.patch, YARN-6777.005.patch > > > This JIRA extends the Processor introduced in YARN-6776 with a configurable > processing chain of interceptors. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6798) Fix NM startup failure with old state store due to version mismatch
[ https://issues.apache.org/jira/browse/YARN-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092308#comment-16092308 ] Botong Huang commented on YARN-6798: Thanks [~rchiang]! > Fix NM startup failure with old state store due to version mismatch > --- > > Key: YARN-6798 > URL: https://issues.apache.org/jira/browse/YARN-6798 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0-alpha4 >Reporter: Ray Chiang >Assignee: Botong Huang > Fix For: 3.0.0-beta1 > > Attachments: YARN-6798.v1.patch, YARN-6798.v2.patch > > > YARN-6703 rolled back the state store version number for the RM from 2.0 to > 1.4. > YARN-6127 bumped the version for the NM to 3.0 > private static final Version CURRENT_VERSION_INFO = > Version.newInstance(3, 0); > YARN-5049 bumped the version for the NM to 2.0 > private static final Version CURRENT_VERSION_INFO = > Version.newInstance(2, 0); > During an upgrade, all NMs died after upgrading a C6 cluster from alpha2 to > alpha4. > {noformat} > 2017-07-07 15:48:17,259 FATAL > org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting > NodeManager > org.apache.hadoop.service.ServiceStateException: java.io.IOException: > Incompatible version for NM state: expecting NM state version 3.0, but > loading version 2.0 > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:172) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:246) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:307) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:748) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:809) > Caused by: java.io.IOException: Incompatible version for NM state: expecting > NM state version 3.0, but loading version 2.0 > at > org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.checkVersion(NMLeveldbStateStoreService.java:1454) > at > org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:1308) > at > org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:307) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > ... 5 more > 2017-07-07 15:48:17,277 INFO > org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG: > / > SHUTDOWN_MSG: Shutting down NodeManager at xxx.gce.cloudera.com/aa.bb.cc.dd > / > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
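The incompatibility above follows the usual state-store rule that only a matching major version is loadable. A simplified, self-contained sketch of such a check (illustrative only, not the actual {{NMLeveldbStateStoreService}} code; the {{Version}} stand-in and {{storeVersion}} are placeholders):
{code}
import java.io.IOException;

class VersionCheckSketch {
  // Minimal stand-in for the state-store version record.
  static final class Version {
    final int major, minor;
    Version(int major, int minor) { this.major = major; this.minor = minor; }
    public String toString() { return major + "." + minor; }
  }

  // Placeholder for persisting the new version into the state store.
  void storeVersion(Version v) { }

  void checkVersion(Version loaded, Version current) throws IOException {
    if (loaded.major == current.major && loaded.minor == current.minor) {
      return;                        // exact match: nothing to do
    }
    if (loaded.major == current.major) {
      storeVersion(current);         // compatible minor bump: upgrade in place
    } else {
      // Major mismatch (e.g. 2.0 on disk vs 3.0 expected) aborts NM startup.
      throw new IOException("Incompatible version for NM state: expecting NM state"
          + " version " + current + ", but loading version " + loaded);
    }
  }
}
{code}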
[jira] [Updated] (YARN-6675) Add NM support to launch opportunistic containers based on overallocation
[ https://issues.apache.org/jira/browse/YARN-6675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6675: - Summary: Add NM support to launch opportunistic containers based on overallocation (was: Add NM support to launch opportunistic containers based on oversubscription) > Add NM support to launch opportunistic containers based on overallocation > - > > Key: YARN-6675 > URL: https://issues.apache.org/jira/browse/YARN-6675 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.0.0-alpha3 >Reporter: Haibo Chen >Assignee: Haibo Chen > Attachments: YARN-6675.00.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6796) Add unit test for NM to launch OPPORTUNISTIC container for overallocation
[ https://issues.apache.org/jira/browse/YARN-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-6796: - Summary: Add unit test for NM to launch OPPORTUNISTIC container for overallocation (was: Add unit test for NM to launch OPPORTUNISTIC container for oversubscription) > Add unit test for NM to launch OPPORTUNISTIC container for overallocation > - > > Key: YARN-6796 > URL: https://issues.apache.org/jira/browse/YARN-6796 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Haibo Chen >Assignee: Haibo Chen > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6778) In ResourceWeights, weights and setWeights() should be final
[ https://issues.apache.org/jira/browse/YARN-6778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092289#comment-16092289 ] Daniel Templeton commented on YARN-6778: It should be final because it is invoked in the constructor. Overridable methods called from a constructor can cause issues. See https://stackoverflow.com/questions/3404301/whats-wrong-with-overridable-method-calls-in-constructors. > In ResourceWeights, weights and setWeights() should be final > > > Key: YARN-6778 > URL: https://issues.apache.org/jira/browse/YARN-6778 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.8.1, 3.0.0-alpha4 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Minor > Labels: newbie > Attachments: YARN-6778.001.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
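To illustrate the pitfall generically (this is not the {{ResourceWeights}} code), the override runs while the superclass constructor is still executing, before the subclass's fields are initialized:
{code}
class Base {
  Base() {
    setWeights();                  // overridable method called from a constructor
  }
  void setWeights() { }
}

class Weights extends Base {
  private final double[] weights = new double[] {1.0, 1.0};

  @Override
  void setWeights() {
    // Runs during the Base constructor, before 'weights' has been assigned,
    // so this dereferences null and throws NullPointerException.
    System.out.println(weights.length);
  }
}

// new Weights();  // -> NullPointerException, which is why such methods should be final
{code}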
[jira] [Commented] (YARN-6777) Support for ApplicationMasterService processing chain of interceptors
[ https://issues.apache.org/jira/browse/YARN-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092285#comment-16092285 ] Subru Krishnan commented on YARN-6777: -- Thanks [~asuresh] for updating the patch, this LGTM. I have one question - is it OK to propagate the exception in *getProcessorList* out of {{ApplicationMasterService}}? I have a minor nit on code comments: * Call out in {{DefaultAMSProcessor}} that it *must* be the last interceptor in the chain. * In {{ApplicationMasterService}}, where it ensures the above. * In {{OpportunisticContainerAllocatorAMService}}, why next interceptor will never be null due to above. > Support for ApplicationMasterService processing chain of interceptors > - > > Key: YARN-6777 > URL: https://issues.apache.org/jira/browse/YARN-6777 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-6777.001.patch, YARN-6777.002.patch, > YARN-6777.003.patch, YARN-6777.004.patch, YARN-6777.005.patch > > > This JIRA extends the Processor introduced in YARN-6776 with a configurable > processing chain of interceptors. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
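As a purely illustrative chain-of-responsibility sketch (hypothetical names, not the actual YARN-6777 API), the ordering concern amounts to the default processor always terminating the chain, so every interceptor's {{next}} reference is non-null:
{code}
interface AMSProcessor {
  String allocate(String request);
}

// Terminal handler: must always be the last element of the chain.
class DefaultProcessor implements AMSProcessor {
  public String allocate(String request) {
    return "handled:" + request;
  }
}

// Example interceptor: does its own work, then always delegates to 'next'.
class LoggingInterceptor implements AMSProcessor {
  private final AMSProcessor next;   // never null because the service appends
                                     // the default processor when building the chain
  LoggingInterceptor(AMSProcessor next) {
    this.next = next;
  }
  public String allocate(String request) {
    System.out.println("allocate called for " + request);
    return next.allocate(request);
  }
}

// Chain assembly, default processor last:
//   AMSProcessor chain = new LoggingInterceptor(new DefaultProcessor());
//   chain.allocate("am-attempt-1");
{code}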
[jira] [Commented] (YARN-6628) Unexpected jackson-core-2.2.3 dependency introduced
[ https://issues.apache.org/jira/browse/YARN-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092274#comment-16092274 ] Jason Lowe commented on YARN-6628: -- Thanks for the patch! Shading is far from my first choice, but it seems like we have little other choice unless we can remove the fst dependency completely since we're stuck between a mandatory license upgrade and a mandatory jackson-2 dependency. There are some "com.yahoo" prefixes in the shading directives that need to be updated. Also have you verified that shading didn't pick up any undesired settings for META-INF or other items in the jar that could be problematic? > Unexpected jackson-core-2.2.3 dependency introduced > --- > > Key: YARN-6628 > URL: https://issues.apache.org/jira/browse/YARN-6628 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 2.8.1 >Reporter: Jason Lowe >Assignee: Jonathan Eagles >Priority: Blocker > Attachments: YARN-6628.1.patch, YARN-6628.2-branch-2.8.patch, > YARN-6628.3-branch-2.8.patch > > > The change in YARN-5894 caused jackson-core-2.2.3.jar to be added in > share/hadoop/yarn/lib/. This added dependency seems to be incompatible with > jackson-core-asl-1.9.13.jar which is also shipped as a dependency. This new > jackson-core jar ends up breaking jobs that ran fine on 2.8.0. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-6610) DominantResourceCalculator.getResourceAsValue() dominant param is no longer appropriate
[ https://issues.apache.org/jira/browse/YARN-6610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092210#comment-16092210 ] Yufei Gu edited comment on YARN-6610 at 7/18/17 10:00 PM: -- Hi [~templedf], the patch doesn't apply to trunk. Can you rebase it? Seems like it can apply for branch YARN-3926. Just need to change the patch file name. was (Author: yufeigu): Hi [~templedf], the patch doesn't apply to trunk. Can you rebase it? > DominantResourceCalculator.getResourceAsValue() dominant param is no longer > appropriate > --- > > Key: YARN-6610 > URL: https://issues.apache.org/jira/browse/YARN-6610 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: YARN-3926 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Critical > Attachments: YARN-6610.001.patch > > > The {{dominant}} param assumes there are only two resources, i.e. true means > to compare the dominant, and false means to compare the subordinate. Now > that there are _n_ resources, this parameter no longer makes sense. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6130) [ATSv2 Security] Generate a delegation token for AM when app collector is created and pass it to AM via NM and RM
[ https://issues.apache.org/jira/browse/YARN-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092244#comment-16092244 ] Hadoop QA commented on YARN-6130: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 9 new or modified test files. {color} | || || || || {color:brown} YARN-5355 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 38s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 11s{color} | {color:green} YARN-5355 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 6s{color} | {color:green} YARN-5355 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 13s{color} | {color:green} YARN-5355 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 5m 11s{color} | {color:green} YARN-5355 passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 16s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common in YARN-5355 has 2 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 58s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager in YARN-5355 has 5 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 15s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager in YARN-5355 has 8 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 43s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client in YARN-5355 has 2 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 49s{color} | {color:red} hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app in YARN-5355 has 3 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 50s{color} | {color:green} YARN-5355 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 13m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 13m 22s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 2m 4s{color} | {color:orange} root: The patch generated 3 new + 393 unchanged - 2 fixed = 396 total (was 395) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 5m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 8m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 15s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 39s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color}
[jira] [Commented] (YARN-6778) In ResourceWeights, weights and setWeights() should be final
[ https://issues.apache.org/jira/browse/YARN-6778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092236#comment-16092236 ] Yufei Gu commented on YARN-6778: Thanks [~templedf] for working on this. Method setWeights() is invoked not only in the constructor, so why does it need to be final? > In ResourceWeights, weights and setWeights() should be final > > > Key: YARN-6778 > URL: https://issues.apache.org/jira/browse/YARN-6778 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.8.1, 3.0.0-alpha4 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Minor > Labels: newbie > Attachments: YARN-6778.001.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6610) DominantResourceCalculator.getResourceAsValue() dominant param is no longer appropriate
[ https://issues.apache.org/jira/browse/YARN-6610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092210#comment-16092210 ] Yufei Gu commented on YARN-6610: Hi [~templedf], the patch doesn't apply to trunk. Can you rebase it? > DominantResourceCalculator.getResourceAsValue() dominant param is no longer > appropriate > --- > > Key: YARN-6610 > URL: https://issues.apache.org/jira/browse/YARN-6610 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: YARN-3926 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Critical > Attachments: YARN-6610.001.patch > > > The {{dominant}} param assumes there are only two resources, i.e. true means > to compare the dominant, and false means to compare the subordinate. Now > that there are _n_ resources, this parameter no longer makes sense. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6763) TestProcfsBasedProcessTree#testProcessTree fails in trunk
[ https://issues.apache.org/jira/browse/YARN-6763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092164#comment-16092164 ] Bibin A Chundatt commented on YARN-6763: {quote} what is the OS environment you're running in {quote} root@bibinpc:~# cat /etc/lsb-release DISTRIB_ID=Ubuntu DISTRIB_RELEASE=16.10 DISTRIB_CODENAME=yakkety DISTRIB_DESCRIPTION="Ubuntu 16.10" > TestProcfsBasedProcessTree#testProcessTree fails in trunk > - > > Key: YARN-6763 > URL: https://issues.apache.org/jira/browse/YARN-6763 > Project: Hadoop YARN > Issue Type: Test >Reporter: Bibin A Chundatt >Assignee: Nathan Roberts >Priority: Minor > > {code} > Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.949 sec <<< > FAILURE! - in org.apache.hadoop.yarn.util.TestProcfsBasedProcessTree > testProcessTree(org.apache.hadoop.yarn.util.TestProcfsBasedProcessTree) Time > elapsed: 7.119 sec <<< FAILURE! > java.lang.AssertionError: Child process owned by init escaped process tree. > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.assertTrue(Assert.java:41) > at > org.apache.hadoop.yarn.util.TestProcfsBasedProcessTree.testProcessTree(TestProcfsBasedProcessTree.java:184) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6775) CapacityScheduler: Improvements to assignContainers, avoid unnecessary canAssignToUser/Queue calls
[ https://issues.apache.org/jira/browse/YARN-6775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092146#comment-16092146 ] Hadoop QA commented on YARN-6775: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 32s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} branch-2 passed with JDK v1.8.0_131 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s{color} | {color:green} branch-2 passed with JDK v1.7.0_131 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 9s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s{color} | {color:green} branch-2 passed with JDK v1.8.0_131 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} branch-2 passed with JDK v1.7.0_131 {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s{color} | {color:green} the patch passed with JDK v1.8.0_131 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} the patch passed with JDK v1.7.0_131 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 27s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 12 new + 632 unchanged - 0 fixed = 644 total (was 632) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s{color} | {color:green} the patch passed with JDK v1.8.0_131 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed with JDK v1.7.0_131 {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 46m 26s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_131. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}108m 44s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.7.0_131 Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler | | | hadoop.yarn.server.resourcemanager.TestRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:5e40efe | | JIRA Issue | YARN-6775 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877850/YARN-6775.branch-2.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 4020fa1c57b7 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Updated] (YARN-6130) [ATSv2 Security] Generate a delegation token for AM when app collector is created and pass it to AM via NM and RM
[ https://issues.apache.org/jira/browse/YARN-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-6130: --- Attachment: (was: YARN-6130-YARN-5355.04.patch) > [ATSv2 Security] Generate a delegation token for AM when app collector is > created and pass it to AM via NM and RM > - > > Key: YARN-6130 > URL: https://issues.apache.org/jira/browse/YARN-6130 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-5355-merge-blocker > Attachments: YARN-6130-YARN-5355.01.patch, > YARN-6130-YARN-5355.02.patch, YARN-6130-YARN-5355.03.patch, > YARN-6130-YARN-5355.04.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6628) Unexpected jackson-core-2.2.3 dependency introduced
[ https://issues.apache.org/jira/browse/YARN-6628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092076#comment-16092076 ] Jonathan Eagles commented on YARN-6628: --- [~jlowe], this patch YARN-6628.3-branch-2.8.patch works for both branch-2 and branch-2.8. For branch-2.8, this patch totally removes (by shading) jackson 2 and fst from user exposure. For branch-2, there are new uses of jackson 2 in HDFS. I want to unblock the 2.8.2 release, but I want to make sure jackson 2 is removed from HDFS as well. > Unexpected jackson-core-2.2.3 dependency introduced > --- > > Key: YARN-6628 > URL: https://issues.apache.org/jira/browse/YARN-6628 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 2.8.1 >Reporter: Jason Lowe >Assignee: Jonathan Eagles >Priority: Blocker > Attachments: YARN-6628.1.patch, YARN-6628.2-branch-2.8.patch, > YARN-6628.3-branch-2.8.patch > > > The change in YARN-5894 caused jackson-core-2.2.3.jar to be added in > share/hadoop/yarn/lib/. This added dependency seems to be incompatible with > jackson-core-asl-1.9.13.jar which is also shipped as a dependency. This new > jackson-core jar ends up breaking jobs that ran fine on 2.8.0. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5049) Extend NMStateStore to save queued container information
[ https://issues.apache.org/jira/browse/YARN-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-5049: - Release Note: This breaks rolling upgrades because it changes the major version of the NM state store schema. Therefore when a new NM comes up on an old state store it crashes. The state store versions for this change have been updated in YARN-6798. was:This breaks rolling upgrades because it changes the major version of the NM state store schema. Therefore when a new NM comes up on an old state store it crashes. > Extend NMStateStore to save queued container information > > > Key: YARN-5049 > URL: https://issues.apache.org/jira/browse/YARN-5049 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Konstantinos Karanasos > Fix For: 2.9.0, 3.0.0-alpha1 > > Attachments: YARN-5049.001.patch, YARN-5049.002.patch, > YARN-5049.003.patch > > > This JIRA is about extending the NMStateStore to save queued container > information whenever a new container is added to the NM queue. > It also removes the information from the state store when the queued > container starts its execution. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6127) Add support for work preserving NM restart when AMRMProxy is enabled
[ https://issues.apache.org/jira/browse/YARN-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-6127: - Hadoop Flags: Incompatible change Release Note: This breaks rolling upgrades because it changes the major version of the NM state store schema. Therefore when a new NM comes up on an old state store it crashes. The state store versions for this change have been updated in YARN-6798. > Add support for work preserving NM restart when AMRMProxy is enabled > > > Key: YARN-6127 > URL: https://issues.apache.org/jira/browse/YARN-6127 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, nodemanager >Reporter: Subru Krishnan >Assignee: Botong Huang > Fix For: 2.9.0, 3.0.0-alpha4 > > Attachments: YARN-6127-branch-2.v1.patch, YARN-6127.v1.patch, > YARN-6127.v2.patch, YARN-6127.v3.patch, YARN-6127.v4.patch > > > YARN-1336 added the ability to restart NM without loosing any running > containers. In a Federated YARN environment, there's additional state in the > {{AMRMProxy}} to allow for spanning across multiple sub-clusters, so we need > to enhance {{AMRMProxy}} to support work-preserving restart. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6798) Fix NM startup failure with old state store due to version mismatch
[ https://issues.apache.org/jira/browse/YARN-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-6798: - Release Note: This fixes the LevelDB state store for the NodeManager. As of this patch, the state store versions now correspond to the following table. - Previous Patch: YARN-5049 -- LevelDB Key: queued -- Hadoop Versions: 2.9.0, 3.0.0-alpha1 -- Corresponding LevelDB Version: 1.2 - Previous Patch: YARN-6127 -- LevelDB Key: AMRMProxy/NextMasterKey -- Hadoop Versions: 2.9.0, 3.0.0-alpha4 -- Corresponding LevelDB Version: 1.1 was: This fixes the LevelDB state store for the NodeManager. As of this patch, the state store versions now correspond to the following table. || Patch || LevelDBKey(s) || Hadoop Versions || NM LevelDB Version || | YARN-5049 | queued | (2.9.0, 3.0.0-alpha1) | 1.2 | | YARN-6127 | AMRMProxy/NextMasterKey | (2.9.0, 3.0.0-alpha4) | 1.1 | > Fix NM startup failure with old state store due to version mismatch > --- > > Key: YARN-6798 > URL: https://issues.apache.org/jira/browse/YARN-6798 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0-alpha4 >Reporter: Ray Chiang >Assignee: Botong Huang > Fix For: 3.0.0-beta1 > > Attachments: YARN-6798.v1.patch, YARN-6798.v2.patch > > > YARN-6703 rolled back the state store version number for the RM from 2.0 to > 1.4. > YARN-6127 bumped the version for the NM to 3.0 > private static final Version CURRENT_VERSION_INFO = > Version.newInstance(3, 0); > YARN-5049 bumped the version for the NM to 2.0 > private static final Version CURRENT_VERSION_INFO = > Version.newInstance(2, 0); > During an upgrade, all NMs died after upgrading a C6 cluster from alpha2 to > alpha4. > {noformat} > 2017-07-07 15:48:17,259 FATAL > org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting > NodeManager > org.apache.hadoop.service.ServiceStateException: java.io.IOException: > Incompatible version for NM state: expecting NM state version 3.0, but > loading version 2.0 > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:172) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:246) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:307) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:748) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:809) > Caused by: java.io.IOException: Incompatible version for NM state: expecting > NM state version 3.0, but loading version 2.0 > at > org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.checkVersion(NMLeveldbStateStoreService.java:1454) > at > org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:1308) > at > org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:307) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > ... 
5 more > 2017-07-07 15:48:17,277 INFO > org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG: > / > SHUTDOWN_MSG: Shutting down NodeManager at xxx.gce.cloudera.com/aa.bb.cc.dd > / > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
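For readers unfamiliar with the failure mode above, here is a minimal, self-contained sketch of the compatibility rule implied by the "expecting NM state version 3.0, but loading version 2.0" error: the loaded state-store version is accepted only when its major version matches the running software's major version. The class and method names below are illustrative, not the actual NMLeveldbStateStoreService code.
{code}
import java.io.IOException;

// Illustrative sketch only: a state store loaded from disk is considered
// compatible when its major version matches the current software's major
// version; a major-version mismatch aborts NM startup, as in the log above.
public final class StateStoreVersionCheckSketch {

  static boolean isCompatible(int currentMajor, int loadedMajor) {
    return currentMajor == loadedMajor;
  }

  static void checkVersion(int curMajor, int curMinor, int loadedMajor, int loadedMinor)
      throws IOException {
    if (!isCompatible(curMajor, loadedMajor)) {
      throw new IOException("Incompatible version for NM state: expecting NM state version "
          + curMajor + "." + curMinor + ", but loading version "
          + loadedMajor + "." + loadedMinor);
    }
  }

  public static void main(String[] args) throws IOException {
    checkVersion(3, 0, 2, 0); // reproduces the failure reported in this issue
  }
}
{code}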
[jira] [Commented] (YARN-6830) Support quoted strings for environment variables
[ https://issues.apache.org/jira/browse/YARN-6830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092065#comment-16092065 ] Daniel Templeton commented on YARN-6830: Can't you solve this with just a back-reference? {code} private static final Pattern VARVAL_SPLITTER = Pattern.compile( "(?<=^|,)" // preceded by ',' or line begin + '(' + Shell.ENV_NAME_REGEX + ')' // var group + '=' + "(([\"']?)[^,]*\\3)" // val group with optional quotes );{code} > Support quoted strings for environment variables > > > Key: YARN-6830 > URL: https://issues.apache.org/jira/browse/YARN-6830 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Shane Kumpf >Assignee: Shane Kumpf > Attachments: YARN-6830.001.patch > > > There are cases where it is necessary to allow for quoted string literals > within environment variable values when passed via the yarn command line > interface. > For example, consider the following environment variables for a MR map task. > {{MODE=bar}} > {{IMAGE_NAME=foo}} > {{MOUNTS=/tmp/foo,/tmp/bar}} > When running the MR job, these environment variables are supplied as a comma > delimited string. > {{-Dmapreduce.map.env="MODE=bar,IMAGE_NAME=foo,MOUNTS=/tmp/foo,/tmp/bar"}} > In this case, {{MOUNTS}} will be parsed and added to the task environment as > {{MOUNTS=/tmp/foo}}. Any attempt to quote the embedded comma separated value > results in quote characters becoming part of the value, and parsing still > breaks down at the comma. > This issue is to allow for quoting the comma separated value (escaped double > or single quote). This was mentioned on YARN-4595 and will impact YARN-5534 > as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
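To make the back-reference idea above concrete, here is a hedged, standalone sketch that splits a comma-delimited env spec while keeping commas inside quoted values. It is not the patch attached to this issue; {{EnvSplitterSketch}} and the inlined {{ENV_NAME_REGEX}} are stand-ins for the real Shell constant, and the back-reference points at the optional-quote group.
{code}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class EnvSplitterSketch {

  // Stand-in for Shell.ENV_NAME_REGEX.
  private static final String ENV_NAME_REGEX = "[A-Za-z_][A-Za-z0-9_]*";

  // Group 1: variable name; group 2: value (still wrapped in its optional quotes);
  // group 3: the opening quote, re-required at the end via the \3 back-reference.
  private static final Pattern VARVAL_SPLITTER = Pattern.compile(
      "(?<=^|,)"                    // preceded by ',' or line begin
      + "(" + ENV_NAME_REGEX + ")"  // var group
      + "="
      + "(([\"']?).*?\\3)"          // val group with optional quotes
      + "(?=,|$)");                 // value ends at an unquoted ',' or end of input

  static Map<String, String> parse(String spec) {
    Map<String, String> env = new LinkedHashMap<>();
    Matcher m = VARVAL_SPLITTER.matcher(spec);
    while (m.find()) {
      String value = m.group(2);
      if (!m.group(3).isEmpty()) {
        // Drop the quotes that were only there to protect embedded commas.
        value = value.substring(1, value.length() - 1);
      }
      env.put(m.group(1), value);
    }
    return env;
  }

  public static void main(String[] args) {
    System.out.println(parse("MODE=bar,IMAGE_NAME=foo,MOUNTS=\"/tmp/foo,/tmp/bar\""));
    // prints {MODE=bar, IMAGE_NAME=foo, MOUNTS=/tmp/foo,/tmp/bar}
  }
}
{code}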
[jira] [Commented] (YARN-6830) Support quoted strings for environment variables
[ https://issues.apache.org/jira/browse/YARN-6830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092062#comment-16092062 ] Shane Kumpf commented on YARN-6830: --- I've resubmitted the same patch as the unit test failure appeared unrelated and I wasn't seeing the failure locally. Looks better now. > Support quoted strings for environment variables > > > Key: YARN-6830 > URL: https://issues.apache.org/jira/browse/YARN-6830 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Shane Kumpf >Assignee: Shane Kumpf > Attachments: YARN-6830.001.patch > > > There are cases where it is necessary to allow for quoted string literals > within environment variables values when passed via the yarn command line > interface. > For example, consider the follow environment variables for a MR map task. > {{MODE=bar}} > {{IMAGE_NAME=foo}} > {{MOUNTS=/tmp/foo,/tmp/bar}} > When running the MR job, these environment variables are supplied as a comma > delimited string. > {{-Dmapreduce.map.env="MODE=bar,IMAGE_NAME=foo,MOUNTS=/tmp/foo,/tmp/bar"}} > In this case, {{MOUNTS}} will be parsed and added to the task environment as > {{MOUNTS=/tmp/foo}}. Any attempts to quote the embedded comma separated value > results in quote characters becoming part of the value, and parsing still > breaks down at the comma. > This issue is to allow for quoting the comma separated value (escaped double > or single quote). This was mentioned on YARN-4595 and will impact YARN-5534 > as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6830) Support quoted strings for environment variables
[ https://issues.apache.org/jira/browse/YARN-6830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092056#comment-16092056 ] Hadoop QA commented on YARN-6830: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: The patch generated 0 new + 5 unchanged - 2 fixed = 5 total (was 7) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 23s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 26m 19s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-6830 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877856/YARN-6830.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 09da07eb6469 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0b7afc0 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/16483/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/16483/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Support quoted strings for environment variables > > > Key: YARN-6830 > URL: https://issues.apache.org/jira/browse/YARN-6830 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Shane Kumpf >Assignee: Shane Kumpf > Attachments: YARN-6830.001.patch > > > There are cases where it is necessary to allow for quoted string literals > within environment variables values when passed via the yarn command line > interface. >
[jira] [Commented] (YARN-6788) Improve performance of resource profile branch
[ https://issues.apache.org/jira/browse/YARN-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092044#comment-16092044 ] Hadoop QA commented on YARN-6788: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} YARN-3926 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 46s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 17s{color} | {color:green} YARN-3926 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 42s{color} | {color:green} YARN-3926 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s{color} | {color:green} YARN-3926 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 56s{color} | {color:green} YARN-3926 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 10s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in YARN-3926 has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 28s{color} | {color:green} YARN-3926 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 32s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 14s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 52s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 25 new + 125 unchanged - 15 fixed = 150 total (was 140) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 15s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 9s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 33s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 26s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 56s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 99m 42s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.util.resource.TestResources | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesAppsModification | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestAppRunnability | | | hadoop.yarn.server.resourcemanager.TestApplicationMasterService | | | hadoop.yarn.server.resourcemanager.scheduler.fifo.TestFifoScheduler | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler | | |
[jira] [Commented] (YARN-6741) Deleting all children of a Parent Queue on refresh throws exception
[ https://issues.apache.org/jira/browse/YARN-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16092038#comment-16092038 ] Naganarasimha G R commented on YARN-6741: - [~sunilg], Well, I was not thinking of any explicit use case as such, but was just covering all possible scenarios of modifying the queue hierarchy. And technically there is nothing limiting us from supporting this, which helps admins manage/reduce the queues with a minimal number of operations. > Deleting all children of a Parent Queue on refresh throws exception > --- > > Key: YARN-6741 > URL: https://issues.apache.org/jira/browse/YARN-6741 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Affects Versions: 3.0.0-alpha3 >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Attachments: YARN-6741.001.patch, YARN-6741.002.patch, > YARN-6741.003.patch > > > If we configure CS such that all children of a parent queue are deleted and > made as a leaf queue, then {{refreshQueue}} operation fails when > re-initializing the parent Queue > {code} >// Sanity check > if (!(newlyParsedQueue instanceof ParentQueue) || !newlyParsedQueue > .getQueuePath().equals(getQueuePath())) { > throw new IOException( > "Trying to reinitialize " + getQueuePath() + " from " > + newlyParsedQueue.getQueuePath()); > } > {code} > *Expected Behavior:* > Converting a Parent Queue to leafQueue on refreshQueue needs to be supported. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
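For context on where the refresh currently fails, here is a small, self-contained illustration of the sanity check quoted in the description; the class names are simplified stand-ins, and this only reproduces the reported failure rather than showing the fix in the attached patches.
{code}
import java.io.IOException;

// Simplified stand-ins for the CapacityScheduler queue classes, used only to
// show why converting a parent queue to a leaf queue trips the sanity check.
abstract class AbstractQueueSketch {
  private final String queuePath;
  AbstractQueueSketch(String queuePath) { this.queuePath = queuePath; }
  String getQueuePath() { return queuePath; }
}

class LeafQueueSketch extends AbstractQueueSketch {
  LeafQueueSketch(String path) { super(path); }
}

class ParentQueueSketch extends AbstractQueueSketch {
  ParentQueueSketch(String path) { super(path); }

  // Mirrors the check quoted above: reinitializing from a newly parsed queue of a
  // different type (a LeafQueue, because all children were removed) throws.
  void reinitialize(AbstractQueueSketch newlyParsedQueue) throws IOException {
    if (!(newlyParsedQueue instanceof ParentQueueSketch)
        || !newlyParsedQueue.getQueuePath().equals(getQueuePath())) {
      throw new IOException("Trying to reinitialize " + getQueuePath()
          + " from " + newlyParsedQueue.getQueuePath());
    }
  }
}

public final class QueueRefreshSketch {
  public static void main(String[] args) {
    ParentQueueSketch existing = new ParentQueueSketch("root.a");
    AbstractQueueSketch reparsed = new LeafQueueSketch("root.a"); // children deleted in config
    try {
      existing.reinitialize(reparsed);
    } catch (IOException e) {
      System.out.println(e.getMessage()); // Trying to reinitialize root.a from root.a
    }
  }
}
{code}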
[jira] [Updated] (YARN-6798) Fix NM startup failure with old state store due to version mismatch
[ https://issues.apache.org/jira/browse/YARN-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-6798: - Summary: Fix NM startup failure with old state store due to version mismatch (was: NM startup failure with old state store due to version mismatch) > Fix NM startup failure with old state store due to version mismatch > --- > > Key: YARN-6798 > URL: https://issues.apache.org/jira/browse/YARN-6798 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0-alpha4 >Reporter: Ray Chiang >Assignee: Botong Huang > Attachments: YARN-6798.v1.patch, YARN-6798.v2.patch > > > YARN-6703 rolled back the state store version number for the RM from 2.0 to > 1.4. > YARN-6127 bumped the version for the NM to 3.0 > private static final Version CURRENT_VERSION_INFO = > Version.newInstance(3, 0); > YARN-5049 bumped the version for the NM to 2.0 > private static final Version CURRENT_VERSION_INFO = > Version.newInstance(2, 0); > During an upgrade, all NMs died after upgrading a C6 cluster from alpha2 to > alpha4. > {noformat} > 2017-07-07 15:48:17,259 FATAL > org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting > NodeManager > org.apache.hadoop.service.ServiceStateException: java.io.IOException: > Incompatible version for NM state: expecting NM state version 3.0, but > loading version 2.0 > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:172) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:246) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:307) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:748) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:809) > Caused by: java.io.IOException: Incompatible version for NM state: expecting > NM state version 3.0, but loading version 2.0 > at > org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.checkVersion(NMLeveldbStateStoreService.java:1454) > at > org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.initStorage(NMLeveldbStateStoreService.java:1308) > at > org.apache.hadoop.yarn.server.nodemanager.recovery.NMStateStoreService.serviceInit(NMStateStoreService.java:307) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > ... 5 more > 2017-07-07 15:48:17,277 INFO > org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG: > / > SHUTDOWN_MSG: Shutting down NodeManager at xxx.gce.cloudera.com/aa.bb.cc.dd > / > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6830) Support quoted strings for environment variables
[ https://issues.apache.org/jira/browse/YARN-6830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shane Kumpf updated YARN-6830: -- Attachment: (was: YARN-6830.001.patch) > Support quoted strings for environment variables > > > Key: YARN-6830 > URL: https://issues.apache.org/jira/browse/YARN-6830 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Shane Kumpf >Assignee: Shane Kumpf > Attachments: YARN-6830.001.patch > > > There are cases where it is necessary to allow for quoted string literals > within environment variables values when passed via the yarn command line > interface. > For example, consider the follow environment variables for a MR map task. > {{MODE=bar}} > {{IMAGE_NAME=foo}} > {{MOUNTS=/tmp/foo,/tmp/bar}} > When running the MR job, these environment variables are supplied as a comma > delimited string. > {{-Dmapreduce.map.env="MODE=bar,IMAGE_NAME=foo,MOUNTS=/tmp/foo,/tmp/bar"}} > In this case, {{MOUNTS}} will be parsed and added to the task environment as > {{MOUNTS=/tmp/foo}}. Any attempts to quote the embedded comma separated value > results in quote characters becoming part of the value, and parsing still > breaks down at the comma. > This issue is to allow for quoting the comma separated value (escaped double > or single quote). This was mentioned on YARN-4595 and will impact YARN-5534 > as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6830) Support quoted strings for environment variables
[ https://issues.apache.org/jira/browse/YARN-6830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shane Kumpf updated YARN-6830: -- Attachment: YARN-6830.001.patch > Support quoted strings for environment variables > > > Key: YARN-6830 > URL: https://issues.apache.org/jira/browse/YARN-6830 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Shane Kumpf >Assignee: Shane Kumpf > Attachments: YARN-6830.001.patch > > > There are cases where it is necessary to allow for quoted string literals > within environment variables values when passed via the yarn command line > interface. > For example, consider the follow environment variables for a MR map task. > {{MODE=bar}} > {{IMAGE_NAME=foo}} > {{MOUNTS=/tmp/foo,/tmp/bar}} > When running the MR job, these environment variables are supplied as a comma > delimited string. > {{-Dmapreduce.map.env="MODE=bar,IMAGE_NAME=foo,MOUNTS=/tmp/foo,/tmp/bar"}} > In this case, {{MOUNTS}} will be parsed and added to the task environment as > {{MOUNTS=/tmp/foo}}. Any attempts to quote the embedded comma separated value > results in quote characters becoming part of the value, and parsing still > breaks down at the comma. > This issue is to allow for quoting the comma separated value (escaped double > or single quote). This was mentioned on YARN-4595 and will impact YARN-5534 > as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-4455) Support fetching metrics by time range
[ https://issues.apache.org/jira/browse/YARN-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091980#comment-16091980 ] Varun Saxena edited comment on YARN-4455 at 7/18/17 6:47 PM: - [~vrushalic], bq. We should not be using the supplemented timestamp for other tables, since it will change the meaning of the cell timestamp for entity and application tables. The HBase TTL will not work if we modify the cell timestamp. We would like clean up of old metrics to happen after a certain expiration period. I see. I am not sure how it works in HBase but it is likely that cell timestamp + TTL would be checked during compaction to delete data which means metrics would go away on first compaction. If this is how behavior is, you are correct that we should store original timestamps instead of supplemented timestamps in Application and entity table. As I said above, I do not find it necessary to store supplemented timestamp for these 2 tables. Amount of metric data in Flow run table would anyways be controlled. bq. do you know where this is a common code path for flowrun table and the application/entity tables. Is it during reading? I was referring to code for read/write in ColumnHelper. I was assuming this was intentional. However, your point about TTL is most likely correct. bq. Did you see any issues while trying to write regular timestamps? No. We can return regular timestamps in all cases. I guess we would need to return a truncated timestamp from FlowRunCoprocessor as well, even though we always return single value. I think we can raise another JIRA to handle this. was (Author: varun_saxena): [~vrushalic], bq. We should not be using the supplemented timestamp for other tables, since it will change the meaning of the cell timestamp for entity and application tables. The HBase TTL will not work if we modify the cell timestamp. We would like clean up of old metrics to happen after a certain expiration period. I see. I am not sure how it works in HBase but it is likely that cell timestamp + TTL would be checked during compaction to delete data which means metrics would go away on first compaction. If this is how behavior is, you are correct that we should store original timestamps instead of supplemented timestamps in Application and entity table. As I said above, I do not find it necessary to store supplemented timestamp for these 2 tables. Amount of metric data in Flow run table would anyways be controlled. bq. do you know where this is a common code path for flowrun table and the application/entity tables. Is it during reading? I was referring to code during read/write in ColumnHelper. I was assuming this was intentional. However, your point about TTL is most likely correct. bq. Did you see any issues while trying to write regular timestamps? No. We can return regular timestamps in all cases. I guess we would need to return a truncated timestamp from FlowRunCoprocessor as well, even though we always return single value. I think we can raise another JIRA to handle this. 
> Support fetching metrics by time range > -- > > Key: YARN-4455 > URL: https://issues.apache.org/jira/browse/YARN-4455 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-5355 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: YARN-5355, yarn-5355-merge-blocker > Attachments: YARN-4455-YARN-5355.01.patch, > YARN-4455-YARN-5355.02.patch, YARN-4455-YARN-5355.03.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4455) Support fetching metrics by time range
[ https://issues.apache.org/jira/browse/YARN-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091980#comment-16091980 ] Varun Saxena commented on YARN-4455: [~vrushalic], bq. We should not be using the supplemented timestamp for other tables, since it will change the meaning of the cell timestamp for entity and application tables. The HBase TTL will not work if we modify the cell timestamp. We would like clean up of old metrics to happen after a certain expiration period. I see. I am not sure how it works in HBase but it is likely that cell timestamp + TTL would be checked during compaction to delete data which means metrics would go away on first compaction. If this is how behavior is, you are correct that we should store original timestamps instead of supplemented timestamps in Application and entity table. As I said above, I do not find it necessary to store supplemented timestamp for these 2 tables. Amount of metric data in Flow run table would anyways be controlled. bq. do you know where this is a common code path for flowrun table and the application/entity tables. Is it during reading? I was referring to code during read/write in ColumnHelper. I was assuming this was intentional. However, your point about TTL is most likely correct. bq. Did you see any issues while trying to write regular timestamps? No. We can return regular timestamps in all cases. I guess we would need to return a truncated timestamp from FlowRunCoprocessor as well, even though we always return single value. I think we can raise another JIRA to handle this. > Support fetching metrics by time range > -- > > Key: YARN-4455 > URL: https://issues.apache.org/jira/browse/YARN-4455 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-5355 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: YARN-5355, yarn-5355-merge-blocker > Attachments: YARN-4455-YARN-5355.01.patch, > YARN-4455-YARN-5355.02.patch, YARN-4455-YARN-5355.03.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6775) CapacityScheduler: Improvements to assignContainers, avoid unnecessary canAssignToUser/Queue calls
[ https://issues.apache.org/jira/browse/YARN-6775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-6775: - Attachment: YARN-6775.branch-2.003.patch [~nroberts], it cannot be applied cleanly because of YARN-5892. I just attached a rebased patch. > CapacityScheduler: Improvements to assignContainers, avoid unnecessary > canAssignToUser/Queue calls > -- > > Key: YARN-6775 > URL: https://issues.apache.org/jira/browse/YARN-6775 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Affects Versions: 2.8.1, 3.0.0-alpha3 >Reporter: Nathan Roberts >Assignee: Nathan Roberts > Fix For: 3.0.0-beta1 > > Attachments: rmeventprocbusy.png, rpcprocessingtimeschedulerport.png, > YARN-6775.001.patch, YARN-6775.002.patch, YARN-6775.branch-2.002.patch, > YARN-6775.branch-2.003.patch, YARN-6775.branch-2.8.002.patch > > > There are several things in assignContainers() that are done multiple times > even though the result cannot change (canAssignToUser, canAssignToQueue). Add > some local caching to take advantage of this fact. > Will post patch shortly. Patch includes a simple throughput test that > demonstrates that when we have users at their user-limit, the number of > NodeUpdateSchedulerEvents we can process can be improved from 13K/sec to > 50K/sec. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
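As a side note on the optimization itself, the sketch below shows the general shape of the local caching this issue describes: within one assignContainers() pass the per-user check cannot change, so its result can be memoized and reused for every request in that pass. The class and names here are illustrative, not the actual CapacityScheduler code from the patches.
{code}
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Schematic sketch: memoize canAssignToUser results within a single
// assignContainers() pass, since the inputs cannot change mid-pass.
public final class AssignmentCheckCacheSketch {

  private final Map<String, Boolean> perUserResult = new HashMap<>();

  boolean canAssignToUser(String user, Predicate<String> expensiveUserLimitCheck) {
    return perUserResult.computeIfAbsent(user, expensiveUserLimitCheck::test);
  }

  public static void main(String[] args) {
    AssignmentCheckCacheSketch cache = new AssignmentCheckCacheSketch();
    Predicate<String> check = u -> {
      System.out.println("running the real user-limit check for " + u);
      return true;
    };
    cache.canAssignToUser("alice", check); // runs the real check
    cache.canAssignToUser("alice", check); // served from the cache
  }
}
{code}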
[jira] [Created] (YARN-6840) Leverage RMStateStore to store scheduler configuration updates
Wangda Tan created YARN-6840: Summary: Leverage RMStateStore to store scheduler configuration updates Key: YARN-6840 URL: https://issues.apache.org/jira/browse/YARN-6840 Project: Hadoop YARN Issue Type: Sub-task Reporter: Wangda Tan With this change, users don't have to set up a separate storage system (like LevelDB) to store updates of scheduler configs. And dynamic queues can be used when RM HA is enabled. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6777) Support for ApplicationMasterService processing chain of interceptors
[ https://issues.apache.org/jira/browse/YARN-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091953#comment-16091953 ] Hadoop QA commented on YARN-6777: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 6s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 27s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 48s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 32s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 29s{color} | {color:green} hadoop-yarn-common in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 42m 51s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 98m 11s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-6777 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877834/YARN-6777.005.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle xml | | uname | Linux 9a73f05bacbd 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0b7afc0 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | unit |
[jira] [Commented] (YARN-5534) Allow whitelisted volume mounts
[ https://issues.apache.org/jira/browse/YARN-5534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091952#comment-16091952 ] Eric Badger commented on YARN-5534: --- bq. Can you help me understand the use case here? While there are mounts that will be commonly needed by containers, I'm not sure of any bind mounts that every container will require. I was thinking of the current code where we are bind-mounting "/sys/fs/cgroup" for every container. For my use case, we would always want to bind mount "/var/run/nscd" so that users can do lookups inside of the container and utilize the host's configs and cache. With the current state of affairs over in YARN-4266, if we enter the container as a UID:GID pair, MRAppMaster will fail if we don't bind-mount "/var/run/nscd". bq. Given that these mounts are read-only and wholly at the discretion of the admin, I don't see that it should be much of a risk. I think that I agree with this. The mounts have to be provided by the admin, so if they have malicious content in them, that's on them. > Allow whitelisted volume mounts > > > Key: YARN-5534 > URL: https://issues.apache.org/jira/browse/YARN-5534 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: luhuichun >Assignee: Shane Kumpf > Attachments: YARN-5534.001.patch, YARN-5534.002.patch > > > Introduction > Mounting files or directories from the host is one way of passing > configuration and other information into a docker container. > We could allow the user to set a list of mounts in the environment of > ContainerLaunchContext (e.g. /dir1:/targetdir1,/dir2:/targetdir2). > These would be mounted read-only to the specified target locations. This has > been resolved in YARN-4595 > 2.Problem Definition > Bug mounting arbitrary volumes into a Docker container can be a security risk. > 3.Possible solutions > one approach to provide safe mounts is to allow the cluster administrator to > configure a set of parent directories as white list mounting directories. > Add a property named yarn.nodemanager.volume-mounts.white-list, when > container executor do mount checking, only the allowed directories or > sub-directories can be mounted. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6839) [YARN-3368] Be able to view application / container logs on the new YARN UI.
Wangda Tan created YARN-6839: Summary: [YARN-3368] Be able to view application / container logs on the new YARN UI. Key: YARN-6839 URL: https://issues.apache.org/jira/browse/YARN-6839 Project: Hadoop YARN Issue Type: Sub-task Reporter: Wangda Tan Currently, viewing application/container logs redirects to the old UI; we should leverage the new UI's capabilities to provide a better log-viewing experience. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6130) [ATSv2 Security] Generate a delegation token for AM when app collector is created and pass it to AM via NM and RM
[ https://issues.apache.org/jira/browse/YARN-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-6130: --- Attachment: YARN-6130-YARN-5355.04.patch > [ATSv2 Security] Generate a delegation token for AM when app collector is > created and pass it to AM via NM and RM > - > > Key: YARN-6130 > URL: https://issues.apache.org/jira/browse/YARN-6130 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-5355-merge-blocker > Attachments: YARN-6130-YARN-5355.01.patch, > YARN-6130-YARN-5355.02.patch, YARN-6130-YARN-5355.03.patch, > YARN-6130-YARN-5355.04.patch, YARN-6130-YARN-5355.04.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6130) [ATSv2 Security] Generate a delegation token for AM when app collector is created and pass it to AM via NM and RM
[ https://issues.apache.org/jira/browse/YARN-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-6130: --- Attachment: YARN-6130-YARN-5355.04.patch > [ATSv2 Security] Generate a delegation token for AM when app collector is > created and pass it to AM via NM and RM > - > > Key: YARN-6130 > URL: https://issues.apache.org/jira/browse/YARN-6130 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-5355-merge-blocker > Attachments: YARN-6130-YARN-5355.01.patch, > YARN-6130-YARN-5355.02.patch, YARN-6130-YARN-5355.03.patch, > YARN-6130-YARN-5355.04.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6741) Deleting all children of a Parent Queue on refresh throws exception
[ https://issues.apache.org/jira/browse/YARN-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091887#comment-16091887 ] Sunil G commented on YARN-6741: --- Thanks [~naganarasimha...@apache.org] for the explanation. I think I understood your thinking behind it. A second thought that came to my mind is about the old ParentQueue becoming available as a LeafQueue immediately after refresh: it will start receiving applications straight after the refresh. This was my doubt; I am not very sure about the use case you were intending. If it is fine to convert it to a LeafQueue as part of the queue refresh itself, I think that is fine. Otherwise we might need to explicitly start that queue, so that the admin gets an extra knob to actually make that queue operational. I think it is purely based on the use case, and I feel you can share more thoughts on how you intend to make this queue operational for an admin. > Deleting all children of a Parent Queue on refresh throws exception > --- > > Key: YARN-6741 > URL: https://issues.apache.org/jira/browse/YARN-6741 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Affects Versions: 3.0.0-alpha3 >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Attachments: YARN-6741.001.patch, YARN-6741.002.patch, > YARN-6741.003.patch > > > If we configure CS such that all children of a parent queue are deleted and > made as a leaf queue, then {{refreshQueue}} operation fails when > re-initializing the parent Queue > {code} >// Sanity check > if (!(newlyParsedQueue instanceof ParentQueue) || !newlyParsedQueue > .getQueuePath().equals(getQueuePath())) { > throw new IOException( > "Trying to reinitialize " + getQueuePath() + " from " > + newlyParsedQueue.getQueuePath()); > } > {code} > *Expected Behavior:* > Converting a Parent Queue to leafQueue on refreshQueue needs to be supported. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6788) Improve performance of resource profile branch
[ https://issues.apache.org/jira/browse/YARN-6788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-6788: -- Attachment: YARN-6788-YARN-3926.006.patch Latest patch addresses the test failures and is a cleaner patch. Attaching an initial test report also. I will attach a more detailed doc with graphs and other test cases soon. cc/[~leftnoteasy] [~vvasudev] and [~templedf] *Test Setup (SLS)* Number of simulated node managers = 8000 Node Manager Memory = 128G Number of applications submitted = 435 applications at the beginning of the test Size of containers ranged from 1G to 8G Lifetime of containers ranged from a few secs to minutes. *Test Case 1 (Basic resources: CPU and Memory)* +Description+ Use Dominant Resource Calculator to consider CPU and Memory. This test case compares performance directly between YARN-3926 and trunk on the basic resources supported as of today. +Test Results+ Run SLS tests for 2 mins, throughput-per-second = #total-allocated-containers / 120. Branch YARN-3926: Around 2200 containers / sec Trunk: Around 2100 containers / sec. > Improve performance of resource profile branch > -- > > Key: YARN-6788 > URL: https://issues.apache.org/jira/browse/YARN-6788 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Sunil G >Assignee: Sunil G >Priority: Blocker > Attachments: YARN-6788-YARN-3926.001.patch, > YARN-6788-YARN-3926.002.patch, YARN-6788-YARN-3926.003.patch, > YARN-6788-YARN-3926.004.patch, YARN-6788-YARN-3926.005.patch, > YARN-6788-YARN-3926.006.patch > > > Currently we could see a 15% performance delta with this branch. > A few performance improvements to address the same. > Also this patch will handle > [comments|https://issues.apache.org/jira/browse/YARN-6761?focusedCommentId=16075418=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16075418] > from [~leftnoteasy]. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5534) Allow whitelisted volume mounts
[ https://issues.apache.org/jira/browse/YARN-5534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091851#comment-16091851 ] Daniel Templeton commented on YARN-5534: I agree with the opt-in model guarded by the admin-defined whitelist. I also fail to see the use case for admin-enforced mounts. The nature of a container is that it's inscrutable by the system, so there's no telling what's in there or whether any given mount point makes any sense. Given that these mounts are read-only and wholly at the discretion of the admin, I don't see that it should be much of a risk. The main use case for the feature is to make the Hadoop directory mountable by the container, and I see no risk there. As long as we clearly document the risks in the feature docs, I don't see the need to add training wheels to try to keep admins from shooting themselves in the foot. > Allow whitelisted volume mounts > > > Key: YARN-5534 > URL: https://issues.apache.org/jira/browse/YARN-5534 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: luhuichun >Assignee: Shane Kumpf > Attachments: YARN-5534.001.patch, YARN-5534.002.patch > > > Introduction > Mounting files or directories from the host is one way of passing > configuration and other information into a docker container. > We could allow the user to set a list of mounts in the environment of > ContainerLaunchContext (e.g. /dir1:/targetdir1,/dir2:/targetdir2). > These would be mounted read-only to the specified target locations. This has > been resolved in YARN-4595 > 2.Problem Definition > Bug mounting arbitrary volumes into a Docker container can be a security risk. > 3.Possible solutions > one approach to provide safe mounts is to allow the cluster administrator to > configure a set of parent directories as white list mounting directories. > Add a property named yarn.nodemanager.volume-mounts.white-list, when > container executor do mount checking, only the allowed directories or > sub-directories can be mounted. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4455) Support fetching metrics by time range
[ https://issues.apache.org/jira/browse/YARN-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091847#comment-16091847 ] Vrushali C commented on YARN-4455: -- Yes, I know we have functions to read the supplemented timestamp... This modified timestamp was put in for the flow run table for the coprocessor to distinguish between metrics from different apps for the same timestamp, as you have mentioned in some comment above. This is to be used only for the cells in the flow run table. I see that you have mentioned "multiplying it by a factor ensures that code path is common while writing". Did you see any issues while trying to write regular timestamps? I am looking, but do you know where this is a common code path for the flowrun table and the application/entity tables? Is it during reading? The reading of metrics for flow run will not be a timeseries since the flow run table will only keep the latest value of the metric. The reason is that we want only one cell per metric for the flow run across all apps. Having a timeseries for the flow run is a bit more complicated since we have to decide what timestamps we want to keep. The coprocessor will intercept any scan/read (as well as writes) to the table. We should not be using the supplemented timestamp for other tables, since it will change the meaning of the cell timestamp for entity and application tables. The HBase TTL will not work if we modify the cell timestamp. We would like cleanup of old metrics to happen after a certain expiration period. Also note that the TS_MULTIPLIER is 100L, not 1000. Do you want to have a call to discuss this? I am good with having a call during IST daytime. > Support fetching metrics by time range > -- > > Key: YARN-4455 > URL: https://issues.apache.org/jira/browse/YARN-4455 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-5355 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: YARN-5355, yarn-5355-merge-blocker > Attachments: YARN-4455-YARN-5355.01.patch, > YARN-4455-YARN-5355.02.patch, YARN-4455-YARN-5355.03.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
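To make the scheme under discussion easier to follow, here is a hedged, standalone illustration of a supplemented cell timestamp: the real timestamp is multiplied by TS_MULTIPLIER (100L, per the comment above) and an app-specific remainder is packed into the low digits, so concurrent writes of the same metric from different apps are very unlikely to collide, while the original timestamp stays recoverable by integer division. The class and helper names are illustrative, not necessarily the exact code on the YARN-5355 branch.
{code}
public final class SupplementedTimestampSketch {

  static final long TS_MULTIPLIER = 100L;

  // Pack an app-specific discriminator into the low digits of the cell timestamp.
  static long supplement(long timestampMs, String appId) {
    long appDiscriminator = Math.floorMod(appId.hashCode(), TS_MULTIPLIER);
    return timestampMs * TS_MULTIPLIER + appDiscriminator;
  }

  // Inverse used on the read path to hand a regular timestamp back to callers.
  static long truncate(long supplementedTs) {
    return supplementedTs / TS_MULTIPLIER;
  }

  public static void main(String[] args) {
    long ts = 1_500_000_000_000L;
    long a = supplement(ts, "application_1500000000000_0001");
    long b = supplement(ts, "application_1500000000000_0002");
    System.out.println(a != b);            // distinct cells for the same millisecond (true here)
    System.out.println(truncate(a) == ts); // original timestamp is recoverable
  }
}
{code}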
[jira] [Commented] (YARN-6831) Miscellaneous refactoring changes of ContainerScheduler
[ https://issues.apache.org/jira/browse/YARN-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091823#comment-16091823 ] Arun Suresh commented on YARN-6831: --- I was thinking about removing *maxOppQueueLength*, which led me to think about the following. In YARN-5972, we are trying to get the NM to pause an opportunistic container instead of killing it. Both cgroup freezer and Windows job objects implement freezing in the following way: When a process is frozen, its CPU share is reduced to 0 and its working set remains in memory as long as there is no external memory pressure. If the OS can't keep the frozen process in memory, its memory is swapped out to disk and restored when the process is thawed. This implies that the number of paused containers is limited by the total swap space on the NM. This should be another local NM config, maybe something like *maxConsumedOpportunisticResources*, which places an additional limit on the number of running opportunistic containers. > Miscellaneous refactoring changes of ContainerScheduler > -- > > Key: YARN-6831 > URL: https://issues.apache.org/jira/browse/YARN-6831 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Haibo Chen >Assignee: Haibo Chen > > While reviewing YARN-6706, Karthik pointed out a few issues for improvement in > ContainerScheduler > *Make ResourceUtilizationTracker pluggable. That way, we could use a > different tracker when oversubscription is enabled. > *ContainerScheduler > ##Why do we need maxOppQueueLength given queuingLimit? > ##Is there value in splitting runningContainers into runningGuaranteed and > runningOpportunistic? > ##getOpportunisticContainersStatus method implementation feels awkward. How > about capturing the state in the field here, and have metrics etc. pull from > here? > ##startContainersFromQueue: Local variable resourcesAvailable is unnecessary > *OpportunisticContainersStatus > ##Let us clearly differentiate between allocated, used and utilized. Maybe, > we should rename current Used methods to Allocated? > ##I prefer either full name Opportunistic (in method) or Opp (shortest name > that makes sense). Opport is neither short nor fully descriptive. > ##Have we considered folding ContainerQueuingLimit class into this? > We decided to move the issues into this follow up jira to keep YARN-6706 > moving forward to unblock oversubscription work. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5534) Allow whitelisted volume mounts
[ https://issues.apache.org/jira/browse/YARN-5534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091824#comment-16091824 ] Shane Kumpf commented on YARN-5534: --- {quote} So you're proposing having a whitelist of volumes that can be bind-mounted that is defined by the NM and then have the user supply a list of volumes that need to be a subset of that whitelist? {quote} That is correct. The user will opt-in to bind mounts they require, and those bind mount must be in the whitelist (or must be localized resources) for the operation to succeed. {quote} What about volumes that the NM always wants to mount regardless of the user? {quote} Can you help me understand the use case here? While there are mounts that will be commonly needed by containers, I'm not sure of any bind mounts that every container will require. I'd prefer an opt-in model so we don't needless expose host artifacts when they aren't required. However, it wouldn't be very difficult to add this feature, so let me know and I can work to add it. {quote} My question is whether they can leverage these mount points to gain root in the container if minimal capabilities (aka not SETUID/SETGID/etc.) are given. {quote} Great questions. I agree it is possible for them to shoot themselves in the foot, but I don't believe that adding support for bind mounts opens up additional risk with regard to overriding libraries and binaries. Avoiding privileged containers and limiting capabilities is use case dependent, but best practices should be followed to limit the attack surface. Having said that, it seems there could be a need for admins to be able to control the destination mount path within the container. However, the implementation becomes less straight forward for localized resources/distributed cache. Currently we support arbitrary destination paths within the container for localized resources. Consider the hbase container use case, where hbase-site.xml is localized and the hbase processes in the container expect hbase-site.xml to be in /etc/hbase/conf. The admin doesn't know the full path to the localized resources up front, so it wouldn't be possible for the admin to define these localized resources in the whitelist. We could possibly address this through special syntax (i.e. $$LOCALIZED_PATH$$/hbase-site.xml:/etc/hbase/conf/hbase-site.xml:ro") if this is a concern. Thoughts? > Allow whitelisted volume mounts > > > Key: YARN-5534 > URL: https://issues.apache.org/jira/browse/YARN-5534 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: luhuichun >Assignee: Shane Kumpf > Attachments: YARN-5534.001.patch, YARN-5534.002.patch > > > Introduction > Mounting files or directories from the host is one way of passing > configuration and other information into a docker container. > We could allow the user to set a list of mounts in the environment of > ContainerLaunchContext (e.g. /dir1:/targetdir1,/dir2:/targetdir2). > These would be mounted read-only to the specified target locations. This has > been resolved in YARN-4595 > 2.Problem Definition > Bug mounting arbitrary volumes into a Docker container can be a security risk. > 3.Possible solutions > one approach to provide safe mounts is to allow the cluster administrator to > configure a set of parent directories as white list mounting directories. > Add a property named yarn.nodemanager.volume-mounts.white-list, when > container executor do mount checking, only the allowed directories or > sub-directories can be mounted. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
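To illustrate the whitelist rule described in this issue, here is a hedged, standalone sketch: a requested host path may be bind-mounted only if it equals, or lives under, one of the admin-configured parent directories. The property name yarn.nodemanager.volume-mounts.white-list comes from the description; the class and example paths below are illustrative, not the attached patch.
{code}
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Sketch only: validate requested bind mounts against whitelisted parent directories.
public final class MountWhitelistSketch {

  private final List<Path> whitelistedParents;

  MountWhitelistSketch(List<String> whitelist) {
    this.whitelistedParents = whitelist.stream()
        .map(p -> Paths.get(p).toAbsolutePath().normalize())
        .collect(Collectors.toList());
  }

  // True when the requested path equals, or is nested under, a whitelisted directory.
  boolean isAllowed(String requestedHostPath) {
    Path requested = Paths.get(requestedHostPath).toAbsolutePath().normalize();
    return whitelistedParents.stream().anyMatch(requested::startsWith);
  }

  public static void main(String[] args) {
    MountWhitelistSketch check =
        new MountWhitelistSketch(Arrays.asList("/var/run/nscd", "/etc/hadoop/conf"));
    System.out.println(check.isAllowed("/var/run/nscd"));                  // true
    System.out.println(check.isAllowed("/etc/hadoop/conf/core-site.xml")); // true
    System.out.println(check.isAllowed("/etc/passwd"));                    // false
  }
}
{code}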
[jira] [Updated] (YARN-6777) Support for ApplicationMasterService processing chain of interceptors
[ https://issues.apache.org/jira/browse/YARN-6777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6777: -- Attachment: YARN-6777.005.patch Updating patch with: # some javadoc fixes. # checkstyle fixes. # Also fixed the {{TestYarnConfigurationFields}} > Support for ApplicationMasterService processing chain of interceptors > - > > Key: YARN-6777 > URL: https://issues.apache.org/jira/browse/YARN-6777 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-6777.001.patch, YARN-6777.002.patch, > YARN-6777.003.patch, YARN-6777.004.patch, YARN-6777.005.patch > > > This JIRA extends the Processor introduced in YARN-6776 with a configurable > processing chain of interceptors. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6741) Deleting all children of a Parent Queue on refresh throws exception
[ https://issues.apache.org/jira/browse/YARN-6741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091807#comment-16091807 ] Naganarasimha G R commented on YARN-6741: - [~sunilg], any more queries ? > Deleting all children of a Parent Queue on refresh throws exception > --- > > Key: YARN-6741 > URL: https://issues.apache.org/jira/browse/YARN-6741 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Affects Versions: 3.0.0-alpha3 >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Attachments: YARN-6741.001.patch, YARN-6741.002.patch, > YARN-6741.003.patch > > > If we configure CS such that all children of a parent queue are deleted and > made as a leaf queue, then {{refreshQueue}} operation fails when > re-initializing the parent Queue > {code} >// Sanity check > if (!(newlyParsedQueue instanceof ParentQueue) || !newlyParsedQueue > .getQueuePath().equals(getQueuePath())) { > throw new IOException( > "Trying to reinitialize " + getQueuePath() + " from " > + newlyParsedQueue.getQueuePath()); > } > {code} > *Expected Behavior:* > Converting a Parent Queue to leafQueue on refreshQueue needs to be supported. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6838) Add support to LinuxContainerExecutor to support container PAUSE
[ https://issues.apache.org/jira/browse/YARN-6838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6838: -- Description: This JIRA tracks the changes needed to the {{LinuxContainerExecutor}}, {{LinuxContainerRuntime}}, {{DockerLinuxContainerRuntime}} and the {{container-executor}} linux binary to support container PAUSE using cgroups freezer module (was: This JIRA tracks the changes needed in the container-executor linux binary to support container PAUSE using cgroups freezer module) > Add support to LinuxContainerExecutor to support container PAUSE > > > Key: YARN-6838 > URL: https://issues.apache.org/jira/browse/YARN-6838 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh > > This JIRA tracks the changes needed to the {{LinuxContainerExecutor}}, > {{LinuxContainerRuntime}}, {{DockerLinuxContainerRuntime}} and the > {{container-executor}} linux binary to support container PAUSE using cgroups > freezer module -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
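For background on the mechanism named in the description, pausing a container with the cgroups freezer amounts to writing FROZEN (and later THAWED) into the freezer.state file of the container's cgroup. The sketch below shows that idea in Java with an assumed cgroup layout; in practice this work belongs to the container-executor binary running with the required privileges.
{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Illustrative sketch only: pause/resume a container via the cgroup v1 freezer controller. */
public class FreezerSketch {
  // Hypothetical cgroup hierarchy; the NM/container-executor owns the real layout.
  private static final String FREEZER_ROOT = "/sys/fs/cgroup/freezer/hadoop-yarn";

  private static void setState(String containerId, String state) throws IOException {
    Path stateFile = Paths.get(FREEZER_ROOT, containerId, "freezer.state");
    Files.write(stateFile, state.getBytes(StandardCharsets.UTF_8));
  }

  /** Writing FROZEN stops every task in the cgroup without killing it. */
  public static void pause(String containerId) throws IOException {
    setState(containerId, "FROZEN");
  }

  /** Writing THAWED lets the tasks continue from where they were stopped. */
  public static void resume(String containerId) throws IOException {
    setState(containerId, "THAWED");
  }
}
{code}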
[jira] [Updated] (YARN-6838) Add support to LinuxContainerExecutor to support container PAUSE
[ https://issues.apache.org/jira/browse/YARN-6838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6838: -- Summary: Add support to LinuxContainerExecutor to support container PAUSE (was: Add support to linux container-executor to support container PAUSE) > Add support to LinuxContainerExecutor to support container PAUSE > > > Key: YARN-6838 > URL: https://issues.apache.org/jira/browse/YARN-6838 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh > > This JIRA tracks the changes needed in the container-executor linux binary to > support container PAUSE using cgroups freezer module -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6838) Add support to linux container-executor to support container PAUSE
[ https://issues.apache.org/jira/browse/YARN-6838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6838: -- Summary: Add support to linux container-executor to support container PAUSE (was: Add support to linux container-executor to support container freezing ) > Add support to linux container-executor to support container PAUSE > -- > > Key: YARN-6838 > URL: https://issues.apache.org/jira/browse/YARN-6838 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh > > This JIRA tracks the changes needed in the container-executor linux binary to > support container PAUSE using cgroups freezer module -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6838) Add support to linux container-executor to support container freezing
[ https://issues.apache.org/jira/browse/YARN-6838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-6838: -- Description: This JIRA tracks the changes needed in the container-executor linux binary to support container PAUSE using cgroups freezer module > Add support to linux container-executor to support container freezing > -- > > Key: YARN-6838 > URL: https://issues.apache.org/jira/browse/YARN-6838 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh > > This JIRA tracks the changes needed in the container-executor linux binary to > support container PAUSE using cgroups freezer module -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6838) Add support to linux container-executor to support container freezing
Arun Suresh created YARN-6838: - Summary: Add support to linux container-executor to support container freezing Key: YARN-6838 URL: https://issues.apache.org/jira/browse/YARN-6838 Project: Hadoop YARN Issue Type: Sub-task Reporter: Arun Suresh -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6833) On branch-2 ResourceManager failed to start
[ https://issues.apache.org/jira/browse/YARN-6833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091758#comment-16091758 ] Jason Lowe commented on YARN-6833: -- I'm not able to reproduce this. Usually a NSM error like this implies the jars that were provided at runtime do not match the ones used at compile time. Possibly something came along after the compile that corrupted your build? > On branch-2 ResourceManager failed to start > --- > > Key: YARN-6833 > URL: https://issues.apache.org/jira/browse/YARN-6833 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.9 >Reporter: Junping Du >Priority: Blocker > > On build against branch-2, ResourceManager get failed to start because of > following failures: > {noformat} > 2017-07-16 23:33:15,688 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > java.lang.NoSuchMethodError: > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.ContainerAllocationExpirer.setMonitorInterval(I)V > at > org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.ContainerAllocationExpirer.serviceInit(ContainerAllocationExpirer.java:44) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:684) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1005) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:285) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1283) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5534) Allow whitelisted volume mounts
[ https://issues.apache.org/jira/browse/YARN-5534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091749#comment-16091749 ] Eric Badger commented on YARN-5534: --- bq. The admin will define a comma separated list of : (ro or rw) mounts, the requesting user will supply :: - mode must be equal to or lesser than the admin defined mode (i.e. admin defines mount as rw, user can bind mount as rw OR ro). I'm not sure I understand this correctly. Let me know if I have this right. So you're proposing having a whitelist of volumes that can be bind-mounted that is defined by the NM and then have the user supply a list of volumes that need to be a subset of that whitelist? What about volumes that the NM always wants to mount regardless of the user? bq. One question here, does any feel there is value in allowing the admin to restrict the destination mount point within the container? Well they could certainly shoot themselves in the foot pretty easily by mounting over an important directory within the image (e.g. /bin), but I'm not sure if that will ever lead to anything that could prove malicious. Maybe a possibility is that they overwrite /bin with their mount that has a bunch of crafted malicious binaries. Though I'm not sure how they would get the malicious binaries in the src volume on the node. And also, I'm not sure if this is anything different/worse than putting a setuid binary in the distributed cache. Or another possibility would be overwriting glibc with a malicious version. Basically allowing arbitrary mount points allows the user to overwrite things owned by root, which makes me a little uneasy. My question is whether they can leverage these mount points to gain root in the container if minimal capabilities (aka not SETUID/SETGID/etc.) are given. > Allow whitelisted volume mounts > > > Key: YARN-5534 > URL: https://issues.apache.org/jira/browse/YARN-5534 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: luhuichun >Assignee: Shane Kumpf > Attachments: YARN-5534.001.patch, YARN-5534.002.patch > > > Introduction > Mounting files or directories from the host is one way of passing > configuration and other information into a docker container. > We could allow the user to set a list of mounts in the environment of > ContainerLaunchContext (e.g. /dir1:/targetdir1,/dir2:/targetdir2). > These would be mounted read-only to the specified target locations. This has > been resolved in YARN-4595 > 2.Problem Definition > Bug mounting arbitrary volumes into a Docker container can be a security risk. > 3.Possible solutions > one approach to provide safe mounts is to allow the cluster administrator to > configure a set of parent directories as white list mounting directories. > Add a property named yarn.nodemanager.volume-mounts.white-list, when > container executor do mount checking, only the allowed directories or > sub-directories can be mounted. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
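The "mode must be equal to or lesser than the admin defined mode" rule quoted above reduces to a small comparison: a read-only request is always acceptable, while a read-write request requires a read-write whitelist entry. The enum and method below are illustrative only, not part of any patch.
{code}
/** Illustrative sketch only: ro is always permitted; rw only when the admin whitelisted the mount as rw. */
enum MountMode { RO, RW }

final class MountModeCheckSketch {
  static boolean isModeAllowed(MountMode adminMode, MountMode requestedMode) {
    // rw implies ro, so a request must never ask for more than the admin granted
    return requestedMode == MountMode.RO || adminMode == MountMode.RW;
  }
}
{code}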
[jira] [Assigned] (YARN-6837) When the LocalResource's visibility is null, the NodeManager will shutdown
[ https://issues.apache.org/jira/browse/YARN-6837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned YARN-6837: Assignee: Jinjiang Ling Thanks for the report and the patch! Looking at the patch, I'm not a fan of letting an NPE occur then catching it and assuming we know where the NPE came from. It's error prone for maintenance since someone could accidentally introduce another NPE problem and then we are catching and suppressing for the wrong reason making things harder to debug. Speaking of repressing exceptions, this simply logs a warning when we have no visibility, but then it just continues. What will happen to the resource after that? It doesn't look like we add it to any localizer list and therefore I think the container will just hang waiting for a resource to localize that never will. A better way to handle this is to sanity-check the container launch request in ContainerManagerImpl#startContainerInternal and throw an exception if the request is malformed. This has the benefit of propagating the error back to the client who is making the bad request so they know both that the request was bad and the corresponding container will not be launched. This looks similar to YARN-6403, and the resource visibility was missed in that change. > When the LocalResource's visibility is null, the NodeManager will shutdown > -- > > Key: YARN-6837 > URL: https://issues.apache.org/jira/browse/YARN-6837 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0-alpha4 >Reporter: Jinjiang Ling >Assignee: Jinjiang Ling > Attachments: YARN-6837.patch > > > When I write an yarn application, I create a LocalResource like this > {quote} > LocalResource resource = Records.newRecord(LocalResource.class); > {quote} > Because I forget to set the visibilty of it, so the job is failed when I > submit it. 
> But NodeManager shutdown one by one at the same time, and there is > NullPointerExceptionin NodeManager's log: > {quote} > 2017-07-18 17:54:09,289 INFO > org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hadoop > IP=10.43.156.177OPERATION=Start Container Request > TARGET=ContainerManageImpl RESULT=SUCCESS > APPID=application_1499221670783_0067 > CONTAINERID=container_1499221670783_0067_02_03 > 2017-07-18 17:54:09,292 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: > Error in dispatcher thread > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceSet.addResources(ResourceSet.java:84) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$RequestResourcesTransition.transition(ContainerImpl.java:868) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$RequestResourcesTransition.transition(ContainerImpl.java:819) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:1684) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:96) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1418) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1411) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126) > at java.lang.Thread.run(Thread.java:745) > 2017-07-18 17:54:09,292 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: > Start request for container_1499221670783_0067_02_02 by user hadoop > {quote} > Then I change my code and still set the visibility to null > {quote} > LocalResource resource = LocalResource.newInstance( > URL.fromURI(dst.toUri()), > LocalResourceType.FILE, > {color:red}null{color}, > fileStatus.getLen(), > fileStatus.getModificationTime()); > {quote} >
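The sanity-check approach suggested in the comment above could look roughly like the sketch below: validate every LocalResource in the launch context before accepting the start-container request, and throw back to the client instead of letting a null visibility turn into an NPE in the dispatcher. This is only an illustration of the idea, not the committed fix.
{code}
import java.util.Map;
import org.apache.hadoop.yarn.api.records.LocalResource;
import org.apache.hadoop.yarn.exceptions.YarnException;

/** Illustrative sketch only: reject malformed launch contexts up front. */
final class LocalResourceValidationSketch {
  static void validateLocalResources(Map<String, LocalResource> resources) throws YarnException {
    for (Map.Entry<String, LocalResource> entry : resources.entrySet()) {
      LocalResource rsrc = entry.getValue();
      if (rsrc == null || rsrc.getResource() == null
          || rsrc.getType() == null || rsrc.getVisibility() == null) {
        // propagating an exception here fails only the bad request, not the NodeManager
        throw new YarnException("Malformed LocalResource for key " + entry.getKey()
            + ": URL, type and visibility must all be set");
      }
    }
  }
}
{code}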
[jira] [Updated] (YARN-5920) Fix deadlock in TestRMHA.testTransitionedToStandbyShouldNotHang
[ https://issues.apache.org/jira/browse/YARN-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-5920: - Fix Version/s: 2.8.2 Thanks, Varun! I committed this to branch-2.8 and branch-2.8.2 as well. > Fix deadlock in TestRMHA.testTransitionedToStandbyShouldNotHang > --- > > Key: YARN-5920 > URL: https://issues.apache.org/jira/browse/YARN-5920 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Varun Saxena > Fix For: 2.9.0, 3.0.0-alpha2, 2.8.2 > > Attachments: ThreadDump.txt, YARN-5920.01.patch, YARN-5920.02.patch > > > In build > [linkg|https://builds.apache.org/job/PreCommit-YARN-Build/13986/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt] > test case timed out. This need to be investigated. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6819) Application report fails if app rejected due to nodesize
[ https://issues.apache.org/jira/browse/YARN-6819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091497#comment-16091497 ] Hadoop QA commented on YARN-6819: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 29s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 208 unchanged - 0 fixed = 209 total (was 208) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 44m 4s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 67m 31s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:14b5c93 | | JIRA Issue | YARN-6819 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877781/YARN-6819.004.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 64d6edfb5566 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0b7afc0 | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/16478/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/16478/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/16478/console | | Powered by | Apache Yetus 0.6.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Application report fails if app rejected due to nodesize > > > Key: YARN-6819 > URL:
[jira] [Updated] (YARN-6819) Application report fails if app rejected due to nodesize
[ https://issues.apache.org/jira/browse/YARN-6819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-6819: --- Attachment: YARN-6819.004.patch Attaching patch handling comments from [~sunilg] > Application report fails if app rejected due to nodesize > > > Key: YARN-6819 > URL: https://issues.apache.org/jira/browse/YARN-6819 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-6819.001.patch, YARN-6819.002.patch, > YARN-6819.003.patch, YARN-6819.004.patch > > > In YARN-5006 application rejected when nodesize limit is exceeded. > {{FinalSavingTransition}} stateBeforeFinalSaving not set after skipping save > to store which causes application report failure -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6837) When the LocalResource's visibility is null, the NodeManager will shutdown
[ https://issues.apache.org/jira/browse/YARN-6837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinjiang Ling updated YARN-6837: Affects Version/s: 3.0.0-alpha4 > When the LocalResource's visibility is null, the NodeManager will shutdown > -- > > Key: YARN-6837 > URL: https://issues.apache.org/jira/browse/YARN-6837 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0-alpha4 >Reporter: Jinjiang Ling > Attachments: YARN-6837.patch > > > When I write an yarn application, I create a LocalResource like this > {quote} > LocalResource resource = Records.newRecord(LocalResource.class); > {quote} > Because I forget to set the visibilty of it, so the job is failed when I > submit it. > But NodeManager shutdown one by one at the same time, and there is > NullPointerExceptionin NodeManager's log: > {quote} > 2017-07-18 17:54:09,289 INFO > org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hadoop > IP=10.43.156.177OPERATION=Start Container Request > TARGET=ContainerManageImpl RESULT=SUCCESS > APPID=application_1499221670783_0067 > CONTAINERID=container_1499221670783_0067_02_03 > 2017-07-18 17:54:09,292 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: > Error in dispatcher thread > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceSet.addResources(ResourceSet.java:84) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$RequestResourcesTransition.transition(ContainerImpl.java:868) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$RequestResourcesTransition.transition(ContainerImpl.java:819) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:1684) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:96) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1418) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1411) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126) > at java.lang.Thread.run(Thread.java:745) > 2017-07-18 17:54:09,292 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: > Start request for container_1499221670783_0067_02_02 by user hadoop > {quote} > Then I change my code and still set the visibility to null > {quote} > LocalResource resource = LocalResource.newInstance( > URL.fromURI(dst.toUri()), > LocalResourceType.FILE, > {color:red}null{color}, > fileStatus.getLen(), > fileStatus.getModificationTime()); > {quote} > This error still happen. > At last I set the visibility to correct value, the error do not happen again. > So I think the visibility of LocalResource is null will cause NodeManager > shutdown. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-6837) When the LocalResource's visibility is null, the NodeManager will shutdown
[ https://issues.apache.org/jira/browse/YARN-6837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinjiang Ling updated YARN-6837: Attachment: YARN-6837.patch Attach a patch to avoid this error. > When the LocalResource's visibility is null, the NodeManager will shutdown > -- > > Key: YARN-6837 > URL: https://issues.apache.org/jira/browse/YARN-6837 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jinjiang Ling > Attachments: YARN-6837.patch > > > When I write an yarn application, I create a LocalResource like this > {quote} > LocalResource resource = Records.newRecord(LocalResource.class); > {quote} > Because I forget to set the visibilty of it, so the job is failed when I > submit it. > But NodeManager shutdown one by one at the same time, and there is > NullPointerExceptionin NodeManager's log: > {quote} > 2017-07-18 17:54:09,289 INFO > org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hadoop > IP=10.43.156.177OPERATION=Start Container Request > TARGET=ContainerManageImpl RESULT=SUCCESS > APPID=application_1499221670783_0067 > CONTAINERID=container_1499221670783_0067_02_03 > 2017-07-18 17:54:09,292 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: > Error in dispatcher thread > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceSet.addResources(ResourceSet.java:84) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$RequestResourcesTransition.transition(ContainerImpl.java:868) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$RequestResourcesTransition.transition(ContainerImpl.java:819) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:1684) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:96) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1418) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1411) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126) > at java.lang.Thread.run(Thread.java:745) > 2017-07-18 17:54:09,292 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: > Start request for container_1499221670783_0067_02_02 by user hadoop > {quote} > Then I change my code and still set the visibility to null > {quote} > LocalResource resource = LocalResource.newInstance( > URL.fromURI(dst.toUri()), > LocalResourceType.FILE, > {color:red}null{color}, > fileStatus.getLen(), > fileStatus.getModificationTime()); > {quote} > This error still happen. > At last I set the visibility to correct value, the error do not happen again. > So I think the visibility of LocalResource is null will cause NodeManager > shutdown. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-6837) When the LocalResource's visibility is null, the NodeManager will shutdown
Jinjiang Ling created YARN-6837: --- Summary: When the LocalResource's visibility is null, the NodeManager will shutdown Key: YARN-6837 URL: https://issues.apache.org/jira/browse/YARN-6837 Project: Hadoop YARN Issue Type: Bug Reporter: Jinjiang Ling When I write an yarn application, I create a LocalResource like this {quote} LocalResource resource = Records.newRecord(LocalResource.class); {quote} Because I forget to set the visibilty of it, so the job is failed when I submit it. But NodeManager shutdown one by one at the same time, and there is NullPointerExceptionin NodeManager's log: {quote} 2017-07-18 17:54:09,289 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hadoop IP=10.43.156.177OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1499221670783_0067 CONTAINERID=container_1499221670783_0067_02_03 2017-07-18 17:54:09,292 FATAL org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread java.lang.NullPointerException at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceSet.addResources(ResourceSet.java:84) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$RequestResourcesTransition.transition(ContainerImpl.java:868) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl$RequestResourcesTransition.transition(ContainerImpl.java:819) at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:1684) at org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl.handle(ContainerImpl.java:96) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1418) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:1411) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126) at java.lang.Thread.run(Thread.java:745) 2017-07-18 17:54:09,292 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Start request for container_1499221670783_0067_02_02 by user hadoop {quote} Then I change my code and still set the visibility to null {quote} LocalResource resource = LocalResource.newInstance( URL.fromURI(dst.toUri()), LocalResourceType.FILE, {color:red}null{color}, fileStatus.getLen(), fileStatus.getModificationTime()); {quote} This error still happen. At last I set the visibility to correct value, the error do not happen again. So I think the visibility of LocalResource is null will cause NodeManager shutdown. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6819) Application report fails if app rejected due to nodesize
[ https://issues.apache.org/jira/browse/YARN-6819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091416#comment-16091416 ] Bibin A Chundatt commented on YARN-6819: {quote} If it's an authorizing entity, similar to RM, I guess accepting or rejecting apps/requests/connections makes more sense. {quote} In the current CS code, when the limit is reached an APP_REJECTED event is triggered even though CS is not an authorizing entity; that's the reason I named it APP_SAVE_REJECTED, following the same pattern. > Application report fails if app rejected due to nodesize > > > Key: YARN-6819 > URL: https://issues.apache.org/jira/browse/YARN-6819 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-6819.001.patch, YARN-6819.002.patch, > YARN-6819.003.patch > > > In YARN-5006 application rejected when nodesize limit is exceeded. > {{FinalSavingTransition}} stateBeforeFinalSaving not set after skipping save > to store which causes application report failure -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5731) Preemption calculation is not accurate when reserved containers are present in queue.
[ https://issues.apache.org/jira/browse/YARN-5731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091307#comment-16091307 ] Hudson commented on YARN-5731: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12024 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/12024/]) Addendum patch for YARN-5731 (sunilg: rev 0b7afc060c2024a882bd1934d0f722bfca731742) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerSurgicalPreemption.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicy.java > Preemption calculation is not accurate when reserved containers are present > in queue. > - > > Key: YARN-5731 > URL: https://issues.apache.org/jira/browse/YARN-5731 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 2.8.0 >Reporter: Sunil G >Assignee: Wangda Tan > Fix For: 2.9.0, 3.0.0-beta1 > > Attachments: YARN-5731.001.patch, YARN-5731.002.patch, > YARN-5731.addendum.003.patch, YARN-5731.addendum.004.patch, > YARN-5731.branch-2.002.patch, YARN-5731-branch-2.8.001.patch > > > YARN Capacity Scheduler does not kick Preemption under below scenario. > Two queues A and B each with 50% capacity and 100% maximum capacity and user > limit factor 2. Minimum Container size is 1536MB and total cluster resource > is 40GB. Now submit the first job which needs 1536MB for AM and 9 task > containers each 4.5GB to queue A. Job will get 8 containers total (AM 1536MB > + 7 * 4.5GB = 33GB) and the cluster usage is 93.8% and the job has reserved a > container of 4.5GB. > Now when next job (1536MB for AM and 9 task containers each 4.5GB) is > submitted onto queue B. The job hangs in ACCEPTED state forever and RM > scheduler never kicks in Preemption. (RM UI Image 2 attached) > Test Case: > ./spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client > --queue A --executor-memory 4G --executor-cores 4 --num-executors 9 > ../lib/spark-examples*.jar 100 > After a minute.. > ./spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client > --queue B --executor-memory 4G --executor-cores 4 --num-executors 9 > ../lib/spark-examples*.jar 100 > Credit to: [~Prabhu Joseph] for bug investigation and troubleshooting. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-6819) Application report fails if app rejected due to nodesize
[ https://issues.apache.org/jira/browse/YARN-6819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091264#comment-16091264 ] Sunil G commented on YARN-6819: --- [~bibinchundatt] bq. Is it a mandatory fix or a suggestion? Sorry, I didn't get your point. To add more clarity, lemme try to rephrase my point again. Usually a store or similar entity does not accept or reject any op; it is usually described as saved or failed to save. Hence I coined that term as a failure. If it's an authorizing entity, similar to RM, I guess accepting or rejecting apps/requests/connections makes more sense. > Application report fails if app rejected due to nodesize > > > Key: YARN-6819 > URL: https://issues.apache.org/jira/browse/YARN-6819 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-6819.001.patch, YARN-6819.002.patch, > YARN-6819.003.patch > > > In YARN-5006 application rejected when nodesize limit is exceeded. > {{FinalSavingTransition}} stateBeforeFinalSaving not set after skipping save > to store which causes application report failure -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-4455) Support fetching metrics by time range
[ https://issues.apache.org/jira/browse/YARN-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091246#comment-16091246 ] Varun Saxena edited comment on YARN-4455 at 7/18/17 7:54 AM: - [~vrushalic], we do divide it by TS_MULTIPLIER while reading it back. bq. I am wondering if this might be a concern for entity or application tables. When we multiply the timestamp by TimestampGenerator#TS_MULTIPLIER , I am wondering if the timestamp meaning is changing. Also if it will roll over and may not mean the right thing. bq. For example, if the metrics data was written with timestamp of today at 3pm, the multiplier will move it to another timestamp. Refer to ColumnHelper#readResultsWithTimestamps where we call TimestampGenerator#getTruncatedTimestamp to divide the retrieved timestamp by 1000. was (Author: varun_saxena): [~vrushalic], we do divide it by TS_MULTIPLIER while reading it back. bq. I am wondering if this might be a concern for entity or application tables. When we multiply the timestamp by TimestampGenerator#TS_MULTIPLIER , I am wondering if the timestamp meaning is changing. Also if it will roll over and may not mean the right thing. For example, if the metrics data was written with timestamp of today at 3pm, the multiplier will move it to another timestamp. Refer to ColumnHelper#readResultsWithTimestamps where we call TimestampGenerator#getTruncatedTimestamp to divide the retrieved timestamp by 1000. > Support fetching metrics by time range > -- > > Key: YARN-4455 > URL: https://issues.apache.org/jira/browse/YARN-4455 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-5355 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: YARN-5355, yarn-5355-merge-blocker > Attachments: YARN-4455-YARN-5355.01.patch, > YARN-4455-YARN-5355.02.patch, YARN-4455-YARN-5355.03.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4455) Support fetching metrics by time range
[ https://issues.apache.org/jira/browse/YARN-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091246#comment-16091246 ] Varun Saxena commented on YARN-4455: [~vrushalic], we do divide it by TS_MULTIPLIER while reading it back. bq. I am wondering if this might be a concern for entity or application tables. When we multiply the timestamp by TimestampGenerator#TS_MULTIPLIER , I am wondering if the timestamp meaning is changing. Also if it will roll over and may not mean the right thing. For example, if the metrics data was written with timestamp of today at 3pm, the multiplier will move it to another timestamp. Refer to ColumnHelper#readResultsWithTimestamps where we call TimestampGenerator#getTruncatedTimestamp to divide the retrieved timestamp by 1000. > Support fetching metrics by time range > -- > > Key: YARN-4455 > URL: https://issues.apache.org/jira/browse/YARN-4455 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-5355 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: YARN-5355, yarn-5355-merge-blocker > Attachments: YARN-4455-YARN-5355.01.patch, > YARN-4455-YARN-5355.02.patch, YARN-4455-YARN-5355.03.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
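A small sketch of the multiply-on-write / divide-on-read round trip described above, including why roll-over is not a practical concern for a signed 64-bit cell timestamp. The constant value below is a stand-in; the real multiplier lives in TimestampGenerator.
{code}
/** Illustrative sketch only: cell-timestamp round trip; the multiplier value is a stand-in. */
final class TimestampRoundTripSketch {
  static final long TS_MULTIPLIER = 1000L; // stand-in for TimestampGenerator#TS_MULTIPLIER

  /** Write side: make room for a per-writer offset so concurrent puts do not collide. */
  static long toCellTimestamp(long writeTimeMillis, long uniqueOffset) {
    return writeTimeMillis * TS_MULTIPLIER + (uniqueOffset % TS_MULTIPLIER);
  }

  /** Read side: drop the offset to recover the original wall-clock time. */
  static long getTruncatedTimestamp(long cellTimestamp) {
    return cellTimestamp / TS_MULTIPLIER;
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis(); // roughly 1.5e12 in 2017
    long cellTs = toCellTimestamp(now, 42);
    assert getTruncatedTimestamp(cellTs) == now; // round-trips exactly
    // Long.MAX_VALUE is about 9.2e18, so millis multiplied even by 1,000,000 stays far from roll-over.
  }
}
{code}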
[jira] [Commented] (YARN-6819) Application report fails if app rejected due to nodesize
[ https://issues.apache.org/jira/browse/YARN-6819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091216#comment-16091216 ] Bibin A Chundatt commented on YARN-6819: [~sunilg] FAILED reads more like the save was attempted and then failed due to some issue. Here we are preventing the FAILURE case by rejecting before saving, so I would prefer REJECTED by the store. Is it a mandatory fix or a suggestion? > Application report fails if app rejected due to nodesize > > > Key: YARN-6819 > URL: https://issues.apache.org/jira/browse/YARN-6819 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-6819.001.patch, YARN-6819.002.patch, > YARN-6819.003.patch > > > In YARN-5006 application rejected when nodesize limit is exceeded. > {{FinalSavingTransition}} stateBeforeFinalSaving not set after skipping save > to store which causes application report failure -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
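For readers following the YARN-6819 thread above, the underlying defect is that the transition skips the state-store save without recording the state it came from. A minimal sketch of the intent behind the fix, capturing stateBeforeFinalSaving on both paths, is shown below; it is an illustration only, not the actual patch.
{code}
/** Illustrative sketch only: capture the pre-FINAL_SAVING state even when the store save is skipped. */
final class FinalSavingSketch {
  enum AppState { SUBMITTED, FINAL_SAVING, FAILED }

  private AppState state = AppState.SUBMITTED;
  private AppState stateBeforeFinalSaving;

  void startFinalSaving(boolean skipStoreSave) {
    stateBeforeFinalSaving = state; // record the previous state first, on every path
    state = AppState.FINAL_SAVING;
    if (!skipStoreSave) {
      // normal path: persist the final state to the RM state store before completing
    }
    // report generation can now resolve FINAL_SAVING via stateBeforeFinalSaving
  }
}
{code}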
[jira] [Commented] (YARN-6733) Add table for storing sub-application entities
[ https://issues.apache.org/jira/browse/YARN-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091193#comment-16091193 ] Hadoop QA commented on YARN-6733: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} YARN-5355 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 52s{color} | {color:green} YARN-5355 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} YARN-5355 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} YARN-5355 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s{color} | {color:green} YARN-5355 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 38s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase in YARN-5355 has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s{color} | {color:green} YARN-5355 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase: The patch generated 0 new + 0 unchanged - 7 fixed = 0 total (was 7) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 41s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 24s{color} | {color:green} hadoop-yarn-server-timelineservice-hbase in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 23m 33s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ac17dc | | JIRA Issue | YARN-6733 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12877735/YARN-6733-YARN-5355.005.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 4ddcdde82f32 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | YARN-5355 / 5791ced | | Default Java | 1.8.0_131 | | findbugs | v3.1.0-RC1 | | findbugs | https://builds.apache.org/job/PreCommit-YARN-Build/16477/artifact/patchprocess/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase-warnings.html | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/16477/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/16477/console | | Powered by |
[jira] [Commented] (YARN-6767) Timeline client won't be able to write when TimelineCollector is not up yet, or NM is down
[ https://issues.apache.org/jira/browse/YARN-6767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091187#comment-16091187 ] Vrushali C commented on YARN-6767: -- Here is my observation in one case. I started up a job and then killed the NM of the node that the AM was running on. The job ran successfully and I also have an history file. I see the following error messages in the timeline service context in the AM log. {code} 2017-07-18 06:31:55,772 ERROR [pool-8-thread-1] org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl: TimelineClient has reached to max retry times : 30 for service address: hostname:port 2017-07-18 06:31:55,773 ERROR [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Failed to process Event JOB_FINISHED for the job : job_1500067716904_0256 org.apache.hadoop.yarn.exceptions.YarnException: Failed while publishing entity at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl$TimelineEntityDispatcher.dispatchEntities(TimelineV2ClientImpl.java:425) at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl.putEntities(TimelineV2ClientImpl.java:121) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.processEventForNewTimelineService(JobHistoryEventHandler.java:1289) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler.handleEvent(JobHistoryEventHandler.java:590) at org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler$1.run(JobHistoryEventHandler.java:339) at java.lang.Thread.run(Thread.java:748) Caused by: java.io.IOException: TimelineClient has reached to max retry times : 30 for service address: hostname:port at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl.checkRetryWithSleep(TimelineV2ClientImpl.java:179) at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl.putObjects(TimelineV2ClientImpl.java:151) at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl$EntitiesHolder$1.call(TimelineV2ClientImpl.java:254) at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl$EntitiesHolder$1.call(TimelineV2ClientImpl.java:248) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl$TimelineEntityDispatcher$1.publishWithoutBlockingOnQueue(TimelineV2ClientImpl.java:375) at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl$TimelineEntityDispatcher$1.run(TimelineV2ClientImpl.java:313) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ... 1 more Caused by: java.io.IOException: com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: Connection refused (Connection refused) at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl.putObjects(TimelineV2ClientImpl.java:195) at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl.putObjects(TimelineV2ClientImpl.java:147) ... 
8 more Caused by: com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: Connection refused (Connection refused) at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149) at com.sun.jersey.api.client.Client.handle(Client.java:648) at com.sun.jersey.api.client.WebResource.handle(WebResource.java:670) at com.sun.jersey.api.client.WebResource.access$200(WebResource.java:74) at com.sun.jersey.api.client.WebResource$Builder.put(WebResource.java:533) at org.apache.hadoop.yarn.client.api.impl.TimelineV2ClientImpl.putObjects(TimelineV2ClientImpl.java:188) ... 9 more Caused by: java.net.ConnectException: Connection refused (Connection refused) at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:589) at sun.net.NetworkClient.doConnect(NetworkClient.java:175) at sun.net.www.http.HttpClient.openServer(HttpClient.java:463) at sun.net.www.http.HttpClient.openServer(HttpClient.java:558) at sun.net.www.http.HttpClient.(HttpClient.java:242) at sun.net.www.http.HttpClient.New(HttpClient.java:339) at sun.net.www.http.HttpClient.New(HttpClient.java:357) at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1202) at
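The "reached to max retry times" failure in the log above comes from a bounded retry-with-sleep loop around the PUT to the collector. The sketch below shows that general pattern; it is not the TimelineV2ClientImpl code, and the actual fix discussed in this JIRA also has to handle the collector address changing when the NM hosting it goes down.
{code}
import java.io.IOException;
import java.util.concurrent.Callable;

/** Illustrative sketch only: bounded retry-with-sleep around a publish operation. */
final class BoundedRetrySketch {
  static <T> T callWithRetries(Callable<T> op, int maxRetries, long sleepMillis)
      throws IOException, InterruptedException {
    Exception last = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return op.call();
      } catch (Exception e) {
        last = e;                  // e.g. ConnectException while the collector is unreachable
        Thread.sleep(sleepMillis); // back off before the next attempt
      }
    }
    throw new IOException("Reached max retry times : " + maxRetries, last);
  }
}
{code}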
[jira] [Updated] (YARN-6733) Add table for storing sub-application entities
[ https://issues.apache.org/jira/browse/YARN-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vrushali C updated YARN-6733: - Attachment: YARN-6733-YARN-5355.005.patch Attaching v005 that updates the documentation info > Add table for storing sub-application entities > -- > > Key: YARN-6733 > URL: https://issues.apache.org/jira/browse/YARN-6733 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Vrushali C >Assignee: Vrushali C > Attachments: IMG_7040.JPG, YARN-6733-YARN-5355.001.patch, > YARN-6733-YARN-5355.002.patch, YARN-6733-YARN-5355.003.patch, > YARN-6733-YARN-5355.004.patch, YARN-6733-YARN-5355.005.patch > > > After a discussion with Tez folks, we have been thinking over introducing a > table to store sub-application information. > For example, if a Tez session runs for a certain period as User X and runs a > few AMs. These AMs accept DAGs from other users. Tez will execute these dags > with a doAs user. ATSv2 should store this information in a new table perhaps > called as "sub_application" table. > This jira tracks the code changes needed for table schema creation. > I will file other jiras for writing to that table, updating the user name > fields to include sub-application user etc. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
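Since this JIRA is about creating the table schema, here is a rough sketch of what creating a sub-application style table looks like with the plain HBase 1.x admin API. The table name and column-family names below are illustrative guesses, not the schema defined by the patch, and the real ATSv2 schema creator also configures TTLs, coprocessors and split policies.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

/** Illustrative sketch only: create a sub_application-style table (names and families are stand-ins). */
public class SubApplicationTableSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      HTableDescriptor table =
          new HTableDescriptor(TableName.valueOf("timelineservice.subapplication"));
      table.addFamily(new HColumnDescriptor("i")); // info
      table.addFamily(new HColumnDescriptor("c")); // config
      table.addFamily(new HColumnDescriptor("m")); // metrics
      if (!admin.tableExists(table.getTableName())) {
        admin.createTable(table);
      }
    }
  }
}
{code}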