[jira] [Commented] (YARN-10040) DistributedShell test failure on X86 and ARM

2019-12-28 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004554#comment-17004554
 ] 

Ayush Saxena commented on YARN-10040:
-

Thanks [~bzhaoopenstack] for the report. YARN-9697 seems to be the cause: I 
reverted it and the tests pass for me.

 

[~abmodi] [~bibinchundatt] [~elgoiri] you worked on YARN-9697; could you take a 
look?

> DistributedShell test failure on X86 and ARM
> 
>
> Key: YARN-10040
> URL: https://issues.apache.org/jira/browse/YARN-10040
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
> Environment: X86/ARM
> OS: ubuntu1804
> Java 8
>Reporter: zhao bo
>Priority: Major
>
> * 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithOpportunisticContainers
>  * 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithEnforceExecutionType
> Please see the Apache Jenkins Test result:
> [https://builds.apache.org/job/hadoop-multibranch/job/PR-1767/1/testReport/]
>  
> These 2 tests fail on both the X86 and ARM platforms.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10064) Potential race conditions in AuxServices

2019-12-28 Thread Roman Leventov (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004433#comment-17004433
 ] 

Roman Leventov commented on YARN-10064:
---

If somebody wants to explore similar issues: I checked the production classes 
on the first page of this GitHub search: 
[https://github.com/apache/hadoop/search?p=1=synchronizedMap_q=synchronizedMap]
 and didn't find similar problems in classes other than AuxServices. I didn't 
check the 2nd and 3rd pages of results, though, nor did I search for 
synchronizedList, synchronizedSet, synchronizedSortedMap, etc. 

> Potential race conditions in AuxServices
> 
>
> Key: YARN-10064
> URL: https://issues.apache.org/jira/browse/YARN-10064
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Roman Leventov
>Priority: Major
>
> There are three places in {{AuxServices}} class that may potentially cause 
> race conditions: {{getServices()}}, {{getServiceRecords()}}, 
> {{handle(AuxServicesEvent event)}}. In the first two methods, synchronized 
> collections are returned from the API. If they are iterated concurrently with 
> the underlying collections being updated inside {{AuxServices}}, 
> non-deterministic behavior may follow.
> In {{handle(AuxServicesEvent event)}}, {{serviceMap.values()}} is actually 
> iterated outside of a critical section on {{serviceMap}} object, though it's 
> unclear from the class itself whether handle() may be called concurrently 
> with the methods on {{AuxServices}} that modify {{serviceMap}} or not. So if 
> this is not a bug, adding a comment explaining this would be helpful.






[jira] [Created] (YARN-10064) Potential race conditions in AuxServices

2019-12-28 Thread Roman Leventov (Jira)
Roman Leventov created YARN-10064:
-

 Summary: Potential race conditions in AuxServices
 Key: YARN-10064
 URL: https://issues.apache.org/jira/browse/YARN-10064
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Roman Leventov


There are three places in {{AuxServices}} class that may potentially cause race 
conditions: {{getServices()}}, {{getServiceRecords()}}, 
{{handle(AuxServicesEvent event)}}. In the first two methods, synchronized 
collections are returned from the API. If they are iterated concurrently with 
the underlying collections being updated inside {{AuxServices}}, 
non-deterministic behavior may follow.

In {{handle(AuxServicesEvent event)}}, {{serviceMap.values()}} is actually 
iterated outside of a critical section on {{serviceMap}} object, though it's 
unclear from the class itself whether handle() may be called concurrently with 
the methods on {{AuxServices}} that modify {{serviceMap}} or not. So if this 
is not a bug, adding a comment explaining this would be helpful.
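
To illustrate the concern above, here is a minimal sketch of the general pattern, not the actual AuxServices code. The class name ServiceRegistry, its String-valued map, and all method names are hypothetical. It relies only on the documented contract of Collections.synchronizedMap: individual calls are thread-safe, but traversing a collection view such as values() requires the caller to synchronize on the map itself.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a class guarding its state with a synchronized map.
// This is an illustration only, not the real AuxServices implementation.
class ServiceRegistry {
    private final Map<String, String> serviceMap =
        Collections.synchronizedMap(new HashMap<>());

    void register(String name, String service) {
        serviceMap.put(name, service); // single call: thread-safe on its own
    }

    // Risky: returns a live view. A caller that iterates it without holding
    // the map's lock can race with register() running on another thread.
    Collection<String> liveView() {
        return serviceMap.values();
    }

    // Safer alternative: copy under the map's lock and hand out the copy.
    List<String> snapshot() {
        synchronized (serviceMap) {
            return new ArrayList<>(serviceMap.values());
        }
    }

    // Iteration inside the class also has to hold the map's lock.
    int totalLength() {
        int total = 0;
        synchronized (serviceMap) {
            for (String service : serviceMap.values()) {
                total += service.length();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        ServiceRegistry registry = new ServiceRegistry();
        registry.register("a", "alpha");
        List<String> copy = registry.snapshot();
        registry.register("b", "beta"); // does not affect the earlier copy
        System.out.println(copy.size() + " " + registry.snapshot().size());
    }
}
```

Returning a snapshot trades a small copy cost for callers that never need to know about the lock. Whether that trade-off suits AuxServices depends on how its callers use the returned collections, which this sketch does not assume.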






[jira] [Commented] (YARN-3120) YarnException on windows + org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local dirnm-local-dir, which was marked as good.

2019-12-28 Thread Wei Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004425#comment-17004425
 ] 

Wei Zhang commented on YARN-3120:
-

I resolved it: I opened PowerShell with an administrator account on Windows and 
then started YARN. It works, and the NodeManager logs no warnings.
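
For anyone comparing the two permission strings in the NodeManager warning reported on this issue (expected rwxr-xr-x, actual rwxrwxr-x), the sketch below shows which bit differs. The class PermissionDiff and its method extraBits are hypothetical helpers, not part of Hadoop. It only parses the strings with the JDK's PosixFilePermissions.fromString and does not touch the file system, so it says nothing about why running as administrator changes what YARN sees on Windows.

```java
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper: reports permission bits that are set in `actual`
// but not in `expected`. Pure string parsing, no file system access.
class PermissionDiff {
    static Set<PosixFilePermission> extraBits(String expected, String actual) {
        Set<PosixFilePermission> extra = EnumSet.noneOf(PosixFilePermission.class);
        extra.addAll(PosixFilePermissions.fromString(actual));
        extra.removeAll(PosixFilePermissions.fromString(expected));
        return extra;
    }

    public static void main(String[] args) {
        // Strings taken from the warning in the log for this issue.
        System.out.println(extraBits("rwxr-xr-x", "rwxrwxr-x")); // prints [GROUP_WRITE]
    }
}
```

The only difference between the two strings is the group write bit.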

> YarnException on windows + 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local 
> dirnm-local-dir, which was marked as good.
> ---
>
> Key: YARN-3120
> URL: https://issues.apache.org/jira/browse/YARN-3120
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
> Environment: Windows 8 , Hadoop 2.6.0
>Reporter: vaidhyanathan
>Priority: Major
>
> Hi,
> I tried to follow the instructions in 
> http://wiki.apache.org/hadoop/Hadoop2OnWindows and have set up 
> hadoop-2.6.0.jar on my Windows system.
> I was able to start everything properly, but when I try to run the wordcount 
> job as given in the above URL, the job fails with the exception below.
> 15/01/30 12:56:09 INFO localizer.ResourceLocalizationService: Localizer failed
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local dir /tmp/hadoop-haremangala/nm-local-dir, which was marked as good.
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.getInitializedLocalDirs(ResourceLocalizationService.java:1372)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.access$900(ResourceLocalizationService.java:137)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:1085)






[jira] [Commented] (YARN-3120) YarnException on windows + org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local dirnm-local-dir, which was marked as good.

2019-12-28 Thread Wei Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004422#comment-17004422
 ] 

Wei Zhang commented on YARN-3120:
-

I hit the same issue today when running start-yarn on Windows 10. The Hadoop 
version is 2.8.3.

The NodeManager logs are below:
{code:java}
19/12/28 15:49:09 WARN localizer.ResourceLocalizationService: Permissions incorrectly set for dir /tmp/hadoop-xxx/nm-local-dir/usercache, should be rwxr-xr-x, actual value = rwxrwxr-x
19/12/28 15:49:09 INFO localizer.ResourceLocalizationService: Attempting to initialize /tmp/hadoop-xxx/nm-local-dir
19/12/28 15:49:09 WARN localizer.ResourceLocalizationService: Permissions incorrectly set for dir /tmp/hadoop-xxx/nm-local-dir/usercache, should be rwxr-xr-x, actual value = rwxrwxr-x
19/12/28 15:49:09 WARN localizer.ResourceLocalizationService: Failed to setup local dir /tmp/hadoop-xxx/nm-local-dir, which was marked as good.
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Permissions incorrectly set for dir /tmp/hadoop-xxx/nm-local-dir/usercache, should be rwxr-xr-x, actual value = rwxrwxr-x
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.checkLocalDir(ResourceLocalizationService.java:1562)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.checkAndInitializeLocalDirs(ResourceLocalizationService.java:1530)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$1.onDirsChanged(ResourceLocalizationService.java:271)
    at org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection.registerDirsChangeListener(DirectoryCollection.java:197)
    at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.registerLocalDirsChangeListener(LocalDirsHandlerService.java:242)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.serviceStart(ResourceLocalizationService.java:372)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStart(ContainerManagerImpl.java:502)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:369)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:637)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:684)
19/12/28 15:49:09 INFO service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local dir /tmp/hadoop-xxx/nm-local-dir, which was marked as good.
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to setup local dir /tmp/hadoop-xxx/nm-local-dir, which was marked as good.
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.checkAndInitializeLocalDirs(ResourceLocalizationService.java:1535)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$1.onDirsChanged(ResourceLocalizationService.java:271)
    at org.apache.hadoop.yarn.server.nodemanager.DirectoryCollection.registerDirsChangeListener(DirectoryCollection.java:197)
    at org.apache.hadoop.yarn.server.nodemanager.LocalDirsHandlerService.registerLocalDirsChangeListener(LocalDirsHandlerService.java:242)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.serviceStart(ResourceLocalizationService.java:372)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStart(ContainerManagerImpl.java:502)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
    at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:369)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at 

[jira] [Commented] (YARN-9708) Add Yarnclient#getDelegationToken API implementation and SecureLogin in router

2019-12-28 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17004420#comment-17004420
 ] 

Hadoop QA commented on YARN-9708:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  9s{color} | {color:red} YARN-9708 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9708 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12975960/Add_getDelegationToken_and_SecureLogin_in_router.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25317/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Add Yarnclient#getDelegationToken API implementation and SecureLogin in router
> --
>
> Key: YARN-9708
> URL: https://issues.apache.org/jira/browse/YARN-9708
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: router
>Affects Versions: 3.1.1
>Reporter: Xie YiFan
>Assignee: Xie YiFan
>Priority: Minor
> Attachments: Add_getDelegationToken_and_SecureLogin_in_router.patch
>
>
> 1. We use the router as a proxy to manage multiple clusters that are 
> independent of each other, in order to provide a unified client. Thus, we 
> implemented our customized AMRMProxyPolicy that doesn't broadcast 
> ResourceRequests to other clusters.
> 2. Our production environment needs Kerberos, but the router doesn't support 
> SecureLogin for now.
> https://issues.apache.org/jira/browse/YARN-6539 doesn't work, so we 
> improved it.
> 3. Some frameworks like Oozie get a token via yarnclient#getDelegationToken, 
> which the router doesn't support. Our solution is to add homeCluster to 
> ApplicationSubmissionContextProto & GetDelegationTokenRequestProto. A job is 
> submitted with a specified clusterid so that the router knows which cluster 
> to submit the job to. When the client calls getDelegation, the router gets a 
> token from one RM according to the specified clusterid and applies some 
> mechanism to save this token in memory.
>  


