[jira] [Commented] (YARN-10040) DistributedShell test failure on X86 and ARM

[ https://issues.apache.org/jira/browse/YARN-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004554#comment-17004554 ]

Ayush Saxena commented on YARN-10040:
-------------------------------------

Thanks [~bzhaoopenstack] for the report.
YARN-9697 could be the reason: I reverted it and the tests pass for me.
[~abmodi] [~bibinchundatt] [~elgoiri], you worked on YARN-9697; would you mind taking a look?

> DistributedShell test failure on X86 and ARM
> --------------------------------------------
>
> Key: YARN-10040
> URL: https://issues.apache.org/jira/browse/YARN-10040
> Project: Hadoop YARN
> Issue Type: Bug
> Components: applications/distributed-shell
> Environment: X86/ARM
> OS: ubuntu1804
> Java 8
> Reporter: zhao bo
> Priority: Major
>
> * org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithOpportunisticContainers
> * org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithEnforceExecutionType
> Please see the Apache Jenkins test result:
> [https://builds.apache.org/job/hadoop-multibranch/job/PR-1767/1/testReport/]
>
> These 2 tests fail on both the X86 and ARM platforms.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-10064) Potential race conditions in AuxServices

[ https://issues.apache.org/jira/browse/YARN-10064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004433#comment-17004433 ]

Roman Leventov commented on YARN-10064:
---------------------------------------

If somebody wants to explore similar issues: I checked the production classes on the first page of this GitHub search: [https://github.com/apache/hadoop/search?p=1=synchronizedMap_q=synchronizedMap] and didn't find similar problems in classes other than AuxServices. I didn't check the 2nd and 3rd pages of results, though, nor did I search for synchronizedList, synchronizedSet, synchronizedSortedMap, etc.

> Potential race conditions in AuxServices
> ----------------------------------------
>
> Key: YARN-10064
> URL: https://issues.apache.org/jira/browse/YARN-10064
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Roman Leventov
> Priority: Major
>
> There are three places in the {{AuxServices}} class that may potentially cause
> race conditions: {{getServices()}}, {{getServiceRecords()}}, and
> {{handle(AuxServicesEvent event)}}. In the first two methods, synchronized
> collections are returned from the API. If they are iterated concurrently with
> the underlying collections being updated inside {{AuxServices}},
> non-deterministic behavior may follow.
> In {{handle(AuxServicesEvent event)}}, {{serviceMap.values()}} is actually
> iterated outside of a critical section on the {{serviceMap}} object, though it's
> unclear from the class itself whether handle() may be called concurrently
> with the methods on {{AuxServices}} that modify {{serviceMap}}. So if
> this is not a bug, adding a comment explaining this would be helpful.

[jira] [Created] (YARN-10064) Potential race conditions in AuxServices

Roman Leventov created YARN-10064:
-------------------------------------

Summary: Potential race conditions in AuxServices
Key: YARN-10064
URL: https://issues.apache.org/jira/browse/YARN-10064
Project: Hadoop YARN
Issue Type: Bug
Reporter: Roman Leventov

There are three places in the {{AuxServices}} class that may potentially cause race conditions: {{getServices()}}, {{getServiceRecords()}}, and {{handle(AuxServicesEvent event)}}. In the first two methods, synchronized collections are returned from the API. If they are iterated concurrently with the underlying collections being updated inside {{AuxServices}}, non-deterministic behavior may follow.

In {{handle(AuxServicesEvent event)}}, {{serviceMap.values()}} is actually iterated outside of a critical section on the {{serviceMap}} object, though it's unclear from the class itself whether handle() may be called concurrently with the methods on {{AuxServices}} that modify {{serviceMap}}. So if this is not a bug, adding a comment explaining this would be helpful.

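
Editor's sketch of the hazard described above. This is not the AuxServices code: the class, field, and method names below are hypothetical stand-ins. It relies only on the documented contract of Collections.synchronizedMap, which synchronizes individual calls but requires the caller to hold the map's lock while traversing any of its collection views. One possible way to avoid leaking the live view from a getter is to return a snapshot taken under that lock.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a class that keeps services in a synchronized map.
final class SynchronizedMapSnapshot {
    private final Map<String, String> serviceMap =
        Collections.synchronizedMap(new HashMap<>());

    // Single calls such as put() are synchronized by the wrapper itself.
    void register(String name, String service) {
        serviceMap.put(name, service);
    }

    // Iteration is NOT protected by the wrapper: the caller must hold the
    // map's own lock for the whole traversal.
    int countServicesStartingWith(String prefix) {
        int count = 0;
        synchronized (serviceMap) {
            for (String service : serviceMap.values()) {
                if (service.startsWith(prefix)) {
                    count++;
                }
            }
        }
        return count;
    }

    // Returns an independent copy, so callers can iterate it freely while
    // other threads keep modifying serviceMap.
    List<String> snapshotOfServices() {
        synchronized (serviceMap) {
            return new ArrayList<>(serviceMap.values());
        }
    }

    public static void main(String[] args) {
        SynchronizedMapSnapshot services = new SynchronizedMapSnapshot();
        services.register("a", "svcA");
        List<String> snapshot = services.snapshotOfServices();
        services.register("b", "svcB"); // does not affect the earlier snapshot
        System.out.println(snapshot.size());
        System.out.println(services.countServicesStartingWith("svc"));
    }
}
```

The snapshot costs a copy on each call, which is usually acceptable for small, rarely read collections. Whether that trade-off suits AuxServices is for the maintainers to judge.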
[jira] [Commented] (YARN-3120) YarnException on windows + org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local dirnm-local-dir, which was marked as good.

[ https://issues.apache.org/jira/browse/YARN-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004425#comment-17004425 ]

Wei Zhang commented on YARN-3120:
---------------------------------

I resolved it. I opened PowerShell with an administrator account on Windows and then started YARN. It works, and the NodeManager logs no warnings.

> YarnException on windows +
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local
> dirnm-local-dir, which was marked as good.
> ---------------------------------------------------------------------------
>
> Key: YARN-3120
> URL: https://issues.apache.org/jira/browse/YARN-3120
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.6.0
> Environment: Windows 8 , Hadoop 2.6.0
> Reporter: vaidhyanathan
> Priority: Major
>
> Hi,
> I tried to follow the instructions in
> http://wiki.apache.org/hadoop/Hadoop2OnWindows and have set up
> hadoop-2.6.0.jar on my Windows system.
> I was able to start everything properly, but when I try to run the wordcount job
> as given in the above URL, the job fails with the exception below.
> 15/01/30 12:56:09 INFO localizer.ResourceLocalizationService: Localizer failed
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local dir /tmp/hadoop-haremangala/nm-local-dir, which was marked as good.
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.getInitializedLocalDirs(ResourceLocalizationService.java:1372)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.access$900(ResourceLocalizationService.java:137)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1085)

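
Editor's sketch relating to the NodeManager warning quoted elsewhere in this thread ("should be rwxr-xr-x, actual value = rwxrwxr-x"). It only shows how the two mode strings differ, using the JDK's PosixFilePermissions.fromString. The class and method names are hypothetical, and this is not how ResourceLocalizationService performs its check.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper: lists the permission bits present in 'actual'
// but absent from 'expected'.
final class PermissionDiff {
    static Set<PosixFilePermission> extraBits(String expected, String actual) {
        Set<PosixFilePermission> extra = EnumSet.noneOf(PosixFilePermission.class);
        extra.addAll(PosixFilePermissions.fromString(actual));
        extra.removeAll(PosixFilePermissions.fromString(expected));
        return extra;
    }

    public static void main(String[] args) {
        // Mode strings taken from the warning in this thread.
        System.out.println(extraBits("rwxr-xr-x", "rwxrwxr-x")); // prints [GROUP_WRITE]
    }
}
```

The only difference is the group write bit. This is consistent with the report that starting YARN from an administrator shell changed the outcome, although the thread does not establish the exact cause.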
[jira] [Commented] (YARN-3120) YarnException on windows + org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local dirnm-local-dir, which was marked as good.

[ https://issues.apache.org/jira/browse/YARN-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004422#comment-17004422 ]

Wei Zhang commented on YARN-3120:
---------------------------------

I faced the same issue today when running start-yarn on Windows 10. The Hadoop version is 2.8.3. The NodeManager logs are below:
{code:java}
19/12/28 15:49:09 WARN localizer.ResourceLocalizationService: Permissions incorrectly set for dir /tmp/hadoop-xxx/nm-local-dir/usercache, should be rwxr-xr-x, actual value = rwxrwxr-x
19/12/28 15:49:09 INFO localizer.ResourceLocalizationService: Attempting to initialize /tmp/hadoop-xxx/nm-local-dir
19/12/28 15:49:09 WARN localizer.ResourceLocalizationService: Permissions incorrectly set for dir /tmp/hadoop-xxx/nm-local-dir/usercache, should be rwxr-xr-x, actual value = rwxrwxr-x
19/12/28 15:49:09 WARN localizer.ResourceLocalizationService: Failed to setup local dir /tmp/hadoop-xxx/nm-local-dir, which was marked as good.
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Permissions incorrectly set for dir /tmp/hadoop-xxx/nm-local-dir/usercache, should be rwxr-xr-x, actual value = rwxrwxr-x
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.checkLocalDir(ResourceLocalizationService.java:1562)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.checkAndInitializeLocalDirs(ResourceLocalizationService.java:1530)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$1.onDirsChanged(ResourceLocalizationService.java:271)
    at org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection.registerDirsChangeListener(DirectoryCollection.java:197)
    at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.registerLocalDirsChangeListener(LocalDirsHandlerService.java:242)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.serviceStart(ResourceLocalizationService.java:372)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStart(ContainerManagerImpl.java:502)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:369)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:637)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:684)
19/12/28 15:49:09 INFO service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local dir /tmp/hadoop-xxx/nm-local-dir, which was marked as good.
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local dir /tmp/hadoop-xxx/nm-local-dir, which was marked as good.
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.checkAndInitializeLocalDirs(ResourceLocalizationService.java:1535)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$1.onDirsChanged(ResourceLocalizationService.java:271)
    at org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection.registerDirsChangeListener(DirectoryCollection.java:197)
    at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.registerLocalDirsChangeListener(LocalDirsHandlerService.java:242)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.serviceStart(ResourceLocalizationService.java:372)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStart(ContainerManagerImpl.java:502)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:369)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at
[jira] [Commented] (YARN-9708) Add Yarnclient#getDelegationToken API implementation and SecureLogin in router

[ https://issues.apache.org/jira/browse/YARN-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004420#comment-17004420 ]

Hadoop QA commented on YARN-9708:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 9s{color} | {color:red} YARN-9708 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9708 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12975960/Add_getDelegationToken_and_SecureLogin_in_router.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25317/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Add Yarnclient#getDelegationToken API implementation and SecureLogin in router
> ------------------------------------------------------------------------------
>
> Key: YARN-9708
> URL: https://issues.apache.org/jira/browse/YARN-9708
> Project: Hadoop YARN
> Issue Type: New Feature
> Components: router
> Affects Versions: 3.1.1
> Reporter: Xie YiFan
> Assignee: Xie YiFan
> Priority: Minor
> Attachments: Add_getDelegationToken_and_SecureLogin_in_router.patch
>
>
> 1. We use the router as a proxy to manage multiple clusters that are independent of
> each other, in order to provide a unified client. Thus, we implemented our
> customized AMRMProxyPolicy, which doesn't broadcast ResourceRequest to other
> clusters.
> 2. Our production environment needs Kerberos, but the router doesn't support
> SecureLogin for now.
> https://issues.apache.org/jira/browse/YARN-6539 doesn't work, so we
> improved it.
> 3. Some frameworks, like Oozie, get a Token via yarnclient#getDelegationToken,
> which the router doesn't support.
> Our solution is to add homeCluster to
> ApplicationSubmissionContextProto & GetDelegationTokenRequestProto. A job is
> submitted with a specified clusterid so that the router knows which cluster to
> submit this job to. The router gets a Token from one RM according to the specified
> clusterid when the client calls getDelegation, and applies some mechanism to
> save this token in memory.
>