[jira] [Updated] (YARN-9497) Support grouping by diagnostics for query results of scheduler and app activities

2019-05-09 Thread Tao Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Yang updated YARN-9497:
---
Attachment: YARN-9497.001.patch

> Support grouping by diagnostics for query results of scheduler and app 
> activities
> -
>
> Key: YARN-9497
> URL: https://issues.apache.org/jira/browse/YARN-9497
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-9497.001.patch
>
>
> [Design Doc 
> #4.3|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.6fbpge17dmmr]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler

2019-05-09 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836705#comment-16836705
 ] 

Hudson commented on YARN-9522:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16535 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16535/])
YARN-9522. AppBlock ignores full qualified class name of (gifuma: rev 
1b48100a5e5c6a08b91a9283bc2dbb7725e3236d)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppBlock.java


> AppBlock ignores full qualified class name of PseudoAuthenticationHandler
> -
>
> Key: YARN-9522
> URL: https://issues.apache.org/jira/browse/YARN-9522
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: YARN-9522-001.patch, YARN-9522-002.patch
>
>
> {{AuthenticationHandler}} can be configured using either its FQCN or its type. 
> {{AppBlock}} checks only for the type {{simple}} and ignores the FQCN of 
> {{PseudoAuthenticationHandler}} when checking whether the UI is secured.
> {code}
>* @param authHandler The short-name (or fully qualified class name) of the
>* authentication handler.
> {code}
> *AppBlock.java*
> {code}
> // check if UI is unsecured.
> String httpAuth = 
> conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE);
> this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple");
> {code}
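
The check described above can be sketched as follows. This is only an illustration, not the committed patch: the class and method names are hypothetical, and the fully qualified class name is an assumption that should be verified against the hadoop-auth version in use.

```java
public class UnsecuredUiCheck {
    // Assumed FQCN of the pseudo handler in hadoop-auth; verify against your Hadoop version.
    static final String PSEUDO_HANDLER_FQCN =
        "org.apache.hadoop.security.authentication.server.PseudoAuthenticationHandler";

    // Hypothetical helper: treats the UI as unsecured when the configured
    // authentication type is either the short name "simple" or the FQCN of
    // the pseudo handler. A null value (property not set) is not matched.
    public static boolean isUnsecuredUI(String httpAuth) {
        return "simple".equals(httpAuth) || PSEUDO_HANDLER_FQCN.equals(httpAuth);
    }

    public static void main(String[] args) {
        System.out.println(isUnsecuredUI("simple"));
        System.out.println(isUnsecuredUI(PSEUDO_HANDLER_FQCN));
        System.out.println(isUnsecuredUI("kerberos"));
    }
}
```

Putting the constant on the left side of equals() avoids the separate null check used in the original snippet.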






[jira] [Updated] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler

2019-05-09 Thread Giovanni Matteo Fumarola (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-9522:
---
Fix Version/s: 3.3.0

> AppBlock ignores full qualified class name of PseudoAuthenticationHandler
> -
>
> Key: YARN-9522
> URL: https://issues.apache.org/jira/browse/YARN-9522
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: YARN-9522-001.patch, YARN-9522-002.patch
>
>
> {{AuthenticationHandler}} can be configured using either its FQCN or its type. 
> {{AppBlock}} checks only for the type {{simple}} and ignores the FQCN of 
> {{PseudoAuthenticationHandler}} when checking whether the UI is secured.
> {code}
>* @param authHandler The short-name (or fully qualified class name) of the
>* authentication handler.
> {code}
> *AppBlock.java*
> {code}
> // check if UI is unsecured.
> String httpAuth = 
> conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE);
> this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple");
> {code}






[jira] [Commented] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler

2019-05-09 Thread Giovanni Matteo Fumarola (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836701#comment-16836701
 ] 

Giovanni Matteo Fumarola commented on YARN-9522:


My bad. [^YARN-9522-001.patch] was ok.

Committed v1 to trunk.
Thanks [~Prabhu Joseph].

> AppBlock ignores full qualified class name of PseudoAuthenticationHandler
> -
>
> Key: YARN-9522
> URL: https://issues.apache.org/jira/browse/YARN-9522
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: YARN-9522-001.patch, YARN-9522-002.patch
>
>
> {{AuthenticationHandler}} can be configured using either its FQCN or its type. 
> {{AppBlock}} checks only for the type {{simple}} and ignores the FQCN of 
> {{PseudoAuthenticationHandler}} when checking whether the UI is secured.
> {code}
>* @param authHandler The short-name (or fully qualified class name) of the
>* authentication handler.
> {code}
> *AppBlock.java*
> {code}
> // check if UI is unsecured.
> String httpAuth = 
> conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE);
> this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple");
> {code}






[jira] [Commented] (YARN-9518) can't use CGroups with YARN in centos7

2019-05-09 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836669#comment-16836669
 ] 

Jim Brennan commented on YARN-9518:
---

[~shurong.mai], I believe that [~jhung] is correct, and this problem was fixed 
in YARN-2194.

The question is, what version of Hadoop are you running?   This should not be a 
problem in any release after 2.8.   The variable LINUX_PATH_SEPARATOR (which is 
{{%}}) is now used as a separator instead of comma, so the comma in the path 
{{/sys/fs/cgroup/cpu,cpuacct}} is handled properly as part of the filename.
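
To illustrate the separator point, here is a minimal sketch of why a comma-separated resource list breaks on a path that itself contains a comma, while a {{%}}-separated list does not. The class, the helper, and the sample container paths are hypothetical; this is not the container-executor code, only a demonstration of the splitting behaviour.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class CgroupSeparatorDemo {
    // Hypothetical helper: split a joined resource list on a literal separator.
    public static List<String> split(String joined, String separator) {
        return Arrays.asList(joined.split(Pattern.quote(separator)));
    }

    public static void main(String[] args) {
        // Sample paths for illustration only (container id shortened to "container_x").
        String cpu = "/sys/fs/cgroup/cpu,cpuacct/hadoop-yarn/container_x/tasks";
        String mem = "/sys/fs/cgroup/memory/hadoop-yarn/container_x/tasks";

        // Comma as separator: the comma inside "cpu,cpuacct" splits the first path.
        System.out.println(split(cpu + "," + mem, ","));

        // '%' as separator: both paths come back intact.
        System.out.println(split(cpu + "%" + mem, "%"));
    }
}
```

With the comma separator the first element is the bare directory /sys/fs/cgroup/cpu, which matches the "Is a directory" error in the report.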

If you are running Hadoop version 2.7, then the proper fix would be to pull 
YARN-2194 back to that branch (if applicable).

If you are running 2.8 or later, then is it possible that you have an old 
container-executor binary?   That's the only way I think you would still be 
seeing this.

We have been running on RHEL7 for over a year, which has the same cgroups as 
centos7, and we are not seeing this problem.

 

> can't use CGroups with YARN in centos7 
> ---
>
> Key: YARN-9518
> URL: https://issues.apache.org/jira/browse/YARN-9518
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0, 2.9.2, 2.8.5, 2.7.7, 3.1.2
>Reporter: Shurong Mai
>Priority: Major
>  Labels: cgroup, patch
> Attachments: YARN-9518-branch-2.7.7.001.patch, 
> YARN-9518-trunk.001.patch, YARN-9518.patch
>
>
> The os version is centos7. 
> {code:java}
> cat /etc/redhat-release
> CentOS Linux release 7.3.1611 (Core)
> {code}
> After I set the cgroup configuration variables for YARN, the nodemanager 
> started without any problem. But when I ran a job, the job failed with the 
> exceptional nodemanager logs shown at the end.
> The important line in these logs is "Can't open file /sys/fs/cgroup/cpu as 
> node manager - Is a directory".
> After analysis, I found the reason. In centos6, the cgroup "cpu" and 
> "cpuacct" subsystems are as follows: 
> {code:java}
> /sys/fs/cgroup/cpu
> /sys/fs/cgroup/cpuacct
> {code}
> But in centos7, they are as follows:
> {code:java}
> /sys/fs/cgroup/cpu -> cpu,cpuacct
> /sys/fs/cgroup/cpuacct -> cpu,cpuacct
> /sys/fs/cgroup/cpu,cpuacct{code}
> "cpu" and "cpuacct" have been merged into "cpu,cpuacct", and "cpu" and 
> "cpuacct" are symbolic links to it. 
> Looking at the source code, the nodemanager gets the cgroup subsystem info by 
> reading /proc/mounts, so the path it gets for both the cpu and cpuacct 
> subsystems is "/sys/fs/cgroup/cpu,cpuacct". 
> The resource description argument of container-executor then looks like this: 
> {code:java}
> cgroups=/sys/fs/cgroup/cpu,cpuacct/hadoop-yarn/container_1554210318404_0057_02_01/tasks
> {code}
> There is a comma in the cgroup path, but the comma is also the separator 
> between multiple resources. Therefore, container-executor truncates the 
> cgroup path to "/sys/fs/cgroup/cpu" instead of using the correct path 
> "/sys/fs/cgroup/cpu,cpuacct/hadoop-yarn/container_1554210318404_0057_02_01/tasks" 
> and reports the error "Can't open file /sys/fs/cgroup/cpu as node manager - 
> Is a directory" in the log.
> Hence I modified the source code and submitted a patch. The idea of the patch 
> is that the nodemanager uses "/sys/fs/cgroup/cpu" as the cgroup cpu path 
> rather than "/sys/fs/cgroup/cpu,cpuacct". As a result, the resource 
> description argument of container-executor looks like this: 
> {code:java}
> cgroups=/sys/fs/cgroup/cpu/hadoop-yarn/container_1554210318404_0057_02_01/tasks
> {code}
> Note that there is no comma in the path, and it is still a valid path because 
> "/sys/fs/cgroup/cpu" is a symbolic link to "/sys/fs/cgroup/cpu,cpuacct". 
> After applying the patch, the problem is resolved and the job runs 
> successfully.
> The patch is compatible with the cgroup paths of both older and current OS 
> versions such as centos6 and centos7, and applies generally to other merged 
> cgroup subsystem paths, such as the network subsystem: 
> {code:java}
> /sys/fs/cgroup/net_cls -> net_cls,net_prio
> /sys/fs/cgroup/net_prio -> net_cls,net_prio
> /sys/fs/cgroup/net_cls,net_prio{code}
>  
>  
> ##
> {panel:title=exceptional nodemanager logs:}
> 2019-04-19 20:17:20,095 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl:
>  Container container_1554210318404_0042_01_01 transitioned from LOCALIZED 
> to RUNNING
>  2019-04-19 20:17:20,101 WARN 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exit code 
> from container container_1554210318404_0042_01_01 is : 27
>  2019-04-19 20:17:20,103 WARN 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exception 
> from container-launch with container ID: container_155421031840
>  4_0042_01_00

[jira] [Commented] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler

2019-05-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836639#comment-16836639
 ] 

Hadoop QA commented on YARN-9522:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 17m 9s | trunk passed |
| +1 | compile | 0m 33s | trunk passed |
| +1 | checkstyle | 0m 22s | trunk passed |
| +1 | mvnsite | 0m 37s | trunk passed |
| +1 | shadedclient | 11m 56s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 52s | trunk passed |
| +1 | javadoc | 0m 23s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 29s | the patch passed |
| +1 | compile | 0m 30s | the patch passed |
| +1 | javac | 0m 30s | the patch passed |
| +1 | checkstyle | 0m 14s | the patch passed |
| +1 | mvnsite | 0m 28s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 36s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 1s | the patch passed |
| +1 | javadoc | 0m 22s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 43s | hadoop-yarn-server-common in the patch passed. |
| +1 | asflicense | 0m 28s | The patch does not generate ASF License warnings. |
| | | 50m 11s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9522 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12968314/YARN-9522-002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux d93cc7b48257 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2d31ccc |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24074/testReport/ |
| Max. process+thread count | 412 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24074/console |
| Powered by | Apache Yetus 0.8.0   http://yetu

[jira] [Commented] (YARN-9527) Rogue LocalizerRunner/ContainerLocalizer repeatedly downloading same file

2019-05-09 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836638#comment-16836638
 ] 

Jim Brennan commented on YARN-9527:
---

I was able to repro the problem in branch-2.8 on a one-node cluster by changing 
ApplicationImpl.AppInitDoneTransition() to immediately send a 
ContainerKillEvent after the first ContainerInitEvent is sent. So it's a 
one-time shot for the NM.

I restarted the nodemanager with this change, and then ran a sleep job with a 
list of files to localize.
{noformat}
hadoop jar 
$HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar
 sleep -files 
file1,file2,file3,file4,file5,file6,file7,file8,file9,file10,file11,file12,file13,file14,file15,file16,file17
 -m 10 -r 10 -mt 1 -rt 1
{noformat}
Without my fix, this causes a rogue ContainerLocalizer to get stuck in the 
LOCALIZED at LOCALIZED loop every time. I have verified that my fix prevents 
this.  I have also verified that the fix without the LRUCache portion (just the 
findNextResource change) does not fix the problem (at least for this test case).

> Rogue LocalizerRunner/ContainerLocalizer repeatedly downloading same file
> -
>
> Key: YARN-9527
> URL: https://issues.apache.org/jira/browse/YARN-9527
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.8.5, 3.1.2
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-9527.001.patch, YARN-9527.002.patch, 
> YARN-9527.003.patch, YARN-9527.004.patch
>
>
> A rogue ContainerLocalizer can get stuck in a loop continuously downloading 
> the same file while generating an "Invalid event: LOCALIZED at LOCALIZED" 
> exception on each iteration.  Sometimes this continues long enough that it 
> fills up a disk or depletes available inodes for the filesystem.






[jira] [Updated] (YARN-8622) NodeManager native build fails due to getgrouplist not found on macOS

2019-05-09 Thread Eric Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8622:

Fix Version/s: 3.1.3

> NodeManager native build fails due to getgrouplist not found on macOS
> -
>
> Key: YARN-8622
> URL: https://issues.apache.org/jira/browse/YARN-8622
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0, 3.3.0
> Environment: Darwin 17.7.0 Darwin Kernel Version 17.7.0: Thu Jun 21 
> 22:53:14 PDT 2018; root:xnu-4570.71.2~1/RELEASE_X86_64 x86_64
> Apple LLVM version 9.1.0 (clang-902.0.39.2)
>Reporter: Ewan Higgs
>Assignee: Siyao Meng
>Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: YARN-8622.001.patch, YARN-8622.002.patch
>
>
> Usage of getgrouplist() was added in YARN-7221 and should affect Hadoop 3.2.0 
> and later.
> Compiler:
> {code}
> $ /Library/Developer/CommandLineTools/usr/bin/c++ --version
> Apple LLVM version 9.1.0 (clang-902.0.39.2)
> Target: x86_64-apple-darwin17.7.0
> Thread model: posix
> InstalledDir: /Library/Developer/CommandLineTools/usr/bin
> {code}
> Build line:
> {code}
> [WARNING] /Library/Developer/CommandLineTools/usr/bin/c++   -g -O2 -Wall 
> -pthread -D_FILE_OFFSET_BITS=64 -Wl,-search_paths_first 
> -Wl,-headerpad_max_install_names   
> CMakeFiles/test-oom-listener.dir/main/native/oom-listener/impl/oom_listener.c.o
>  
> CMakeFiles/test-oom-listener.dir/main/native/oom-listener/test/oom_listener_test_main.cc.o
>   -o test/test-oom-listener libgtest.a -lrt 
> {code}
> Error message: 
> {code}
> ...
> [WARNING] 
> /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c:1264:12:
>  error: no matching function for call to 'getgrouplist'
> [WARNING]   int rc = getgrouplist(user, pw->pw_gid, groups, &ngroups);
> [WARNING]^~~~
> [WARNING] /usr/include/unistd.h:653:6: note: candidate function not viable: 
> no known conversion from 'gid_t *' (aka 'unsigned int *') to 'int *' for 3rd 
> argument
> [WARNING] int  getgrouplist(const char *, int, int *, int *);
> [WARNING]  ^
> [WARNING] In file included from 
> /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/utils/test_docker_util.cc:24:
> [WARNING] 
> /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c:1271:9:
>  error: no matching function for call to 'getgrouplist'
> [WARNING] if (getgrouplist(user, pw->pw_gid, groups, &ngroups) == -1) {
> [WARNING] ^~~~
> [WARNING] /usr/include/unistd.h:653:6: note: candidate function not viable: 
> no known conversion from 'gid_t *' (aka 'unsigned int *') to 'int *' for 3rd 
> argument
> [WARNING] int  getgrouplist(const char *, int, int *, int *);
> [WARNING]  ^
> [WARNING] 2 warnings and 2 errors generated.
> [WARNING] make[2]: *** 
> [CMakeFiles/cetest.dir/main/native/container-executor/test/utils/test_docker_util.cc.o]
>  Error 1
> [WARNING] make[1]: *** [CMakeFiles/cetest.dir/all] Error 2
> [WARNING] make: *** [all] Error 2
> {code}






[jira] [Commented] (YARN-8622) NodeManager native build fails due to getgrouplist not found on macOS

2019-05-09 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836628#comment-16836628
 ] 

Eric Yang commented on YARN-8622:
-

[~smeng], pushed this change to branch-3.1.  Thank you.

> NodeManager native build fails due to getgrouplist not found on macOS
> -
>
> Key: YARN-8622
> URL: https://issues.apache.org/jira/browse/YARN-8622
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0, 3.3.0
> Environment: Darwin 17.7.0 Darwin Kernel Version 17.7.0: Thu Jun 21 
> 22:53:14 PDT 2018; root:xnu-4570.71.2~1/RELEASE_X86_64 x86_64
> Apple LLVM version 9.1.0 (clang-902.0.39.2)
>Reporter: Ewan Higgs
>Assignee: Siyao Meng
>Priority: Major
> Fix For: 3.3.0, 3.2.1
>
> Attachments: YARN-8622.001.patch, YARN-8622.002.patch
>
>
> Usage of getgrouplist() was added in YARN-7221 and should affect Hadoop 3.2.0 
> and later.
> Compiler:
> {code}
> $ /Library/Developer/CommandLineTools/usr/bin/c++ --version
> Apple LLVM version 9.1.0 (clang-902.0.39.2)
> Target: x86_64-apple-darwin17.7.0
> Thread model: posix
> InstalledDir: /Library/Developer/CommandLineTools/usr/bin
> {code}
> Build line:
> {code}
> [WARNING] /Library/Developer/CommandLineTools/usr/bin/c++   -g -O2 -Wall 
> -pthread -D_FILE_OFFSET_BITS=64 -Wl,-search_paths_first 
> -Wl,-headerpad_max_install_names   
> CMakeFiles/test-oom-listener.dir/main/native/oom-listener/impl/oom_listener.c.o
>  
> CMakeFiles/test-oom-listener.dir/main/native/oom-listener/test/oom_listener_test_main.cc.o
>   -o test/test-oom-listener libgtest.a -lrt 
> {code}
> Error message: 
> {code}
> ...
> [WARNING] 
> /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c:1264:12:
>  error: no matching function for call to 'getgrouplist'
> [WARNING]   int rc = getgrouplist(user, pw->pw_gid, groups, &ngroups);
> [WARNING]^~~~
> [WARNING] /usr/include/unistd.h:653:6: note: candidate function not viable: 
> no known conversion from 'gid_t *' (aka 'unsigned int *') to 'int *' for 3rd 
> argument
> [WARNING] int  getgrouplist(const char *, int, int *, int *);
> [WARNING]  ^
> [WARNING] In file included from 
> /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/utils/test_docker_util.cc:24:
> [WARNING] 
> /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c:1271:9:
>  error: no matching function for call to 'getgrouplist'
> [WARNING] if (getgrouplist(user, pw->pw_gid, groups, &ngroups) == -1) {
> [WARNING] ^~~~
> [WARNING] /usr/include/unistd.h:653:6: note: candidate function not viable: 
> no known conversion from 'gid_t *' (aka 'unsigned int *') to 'int *' for 3rd 
> argument
> [WARNING] int  getgrouplist(const char *, int, int *, int *);
> [WARNING]  ^
> [WARNING] 2 warnings and 2 errors generated.
> [WARNING] make[2]: *** 
> [CMakeFiles/cetest.dir/main/native/container-executor/test/utils/test_docker_util.cc.o]
>  Error 1
> [WARNING] make[1]: *** [CMakeFiles/cetest.dir/all] Error 2
> [WARNING] make: *** [all] Error 2
> {code}






[jira] [Commented] (YARN-9537) Add configuration to support AM preemption

2019-05-09 Thread Yufei Gu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836625#comment-16836625
 ] 

Yufei Gu commented on YARN-9537:


FairScheduler doesn't prevent you from preempting the AM container. It just 
tries to preempt as few AM containers as possible. 

> Add configuration to support AM preemption
> --
>
> Key: YARN-9537
> URL: https://issues.apache.org/jira/browse/YARN-9537
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: zhoukang
>Priority: Major
>
> In our production cluster, we can tolerate AM preemption. So we can add a 
> configuration to support AM preemption.






[jira] [Commented] (YARN-9202) RM does not track nodes that are in the include list and never register

2019-05-09 Thread Eric Payne (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836617#comment-16836617
 ] 

Eric Payne commented on YARN-9202:
--

[~kshukla], thanks for your explanation. I think we should go ahead with your 
current approach. The current patch does not apply, however, so can you please 
provide an upmerged patch?

> RM does not track nodes that are in the include list and never register
> ---
>
> Key: YARN-9202
> URL: https://issues.apache.org/jira/browse/YARN-9202
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.2, 3.0.3, 2.8.5
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
>Priority: Major
> Attachments: YARN-9202.001.patch
>
>
> The RM state machine decides to put new or running nodes in inactive state 
> only past the point of either registration or being in the exclude list. This 
> does not cover the case where a node is in the include list but never 
> registers. Since all state changes are based on these NodeState 
> transitions, having NEW nodes be listed as inactive first may help. This 
> would change the semantics of how inactiveNodes are looked at today. Another 
> state addition might help this case too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9483) DistributedShell does not release container when failed to localize at launch

2019-05-09 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836608#comment-16836608
 ] 

Prabhu Joseph commented on YARN-9483:
-

Thanks [~giovanni.fumarola] and [~pbacsko].

> DistributedShell does not release container when failed to localize at launch
> -
>
> Key: YARN-9483
> URL: https://issues.apache.org/jira/browse/YARN-9483
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9483-001.patch
>
>
> DistributedShell does not release the container when it fails to localize at 
> launch. The launch thread does not increment the completed and failed 
> container counts when localization fails, so the main thread waits for the 
> containers to complete without failing the job.
> {code}
> yarn jar 
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -shell_command ls  -shell_args / -jar  
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -localize_files /tmp/prabhu
> {code}
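
The accounting problem described above can be sketched roughly as follows. This is a simplified model, not the DistributedShell ApplicationMaster code: the class, fields, and methods are hypothetical, and releasing the container itself is left out.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LaunchAccounting {
    private final int numTotalContainers;
    private final AtomicInteger numCompletedContainers = new AtomicInteger();
    private final AtomicInteger numFailedContainers = new AtomicInteger();

    public LaunchAccounting(int numTotalContainers) {
        this.numTotalContainers = numTotalContainers;
    }

    // Hypothetical launch step; localizeOk == false simulates a localization failure.
    public void launch(boolean localizeOk) {
        if (!localizeOk) {
            // Count the container as both completed and failed so that the
            // waiting main thread can finish and report the job as failed.
            numCompletedContainers.incrementAndGet();
            numFailedContainers.incrementAndGet();
            return;
        }
        // Success path: in a real AM the completion arrives later via a
        // callback; it is counted immediately here to keep the sketch short.
        numCompletedContainers.incrementAndGet();
    }

    // The main thread would poll this instead of waiting forever.
    public boolean isDone() {
        return numCompletedContainers.get() >= numTotalContainers;
    }

    public boolean succeeded() {
        return isDone() && numFailedContainers.get() == 0;
    }

    public static void main(String[] args) {
        LaunchAccounting acc = new LaunchAccounting(2);
        acc.launch(true);
        acc.launch(false);
        System.out.println("done=" + acc.isDone() + " succeeded=" + acc.succeeded());
    }
}
```

If the failure branch skipped both increments, isDone() would never become true for this run, which is the hang described in the report.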






[jira] [Commented] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler

2019-05-09 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836605#comment-16836605
 ] 

Prabhu Joseph commented on YARN-9522:
-

Thanks [~giovanni.fumarola] for the review. Attached patch-002 with above 
change.

> AppBlock ignores full qualified class name of PseudoAuthenticationHandler
> -
>
> Key: YARN-9522
> URL: https://issues.apache.org/jira/browse/YARN-9522
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: YARN-9522-001.patch, YARN-9522-002.patch
>
>
> {{AuthenticationHandler}} can be configured using either its FQCN or its type. 
> {{AppBlock}} checks only for the type {{simple}} and ignores the FQCN of 
> {{PseudoAuthenticationHandler}} when checking whether the UI is secured.
> {code}
>* @param authHandler The short-name (or fully qualified class name) of the
>* authentication handler.
> {code}
> *AppBlock.java*
> {code}
> // check if UI is unsecured.
> String httpAuth = 
> conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE);
> this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple");
> {code}






[jira] [Updated] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler

2019-05-09 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9522:

Attachment: YARN-9522-002.patch

> AppBlock ignores full qualified class name of PseudoAuthenticationHandler
> -
>
> Key: YARN-9522
> URL: https://issues.apache.org/jira/browse/YARN-9522
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: YARN-9522-001.patch, YARN-9522-002.patch
>
>
> {{AuthenticationHandler}} can be configured using either its FQCN or its type. 
> {{AppBlock}} checks only for the type {{simple}} and ignores the FQCN of 
> {{PseudoAuthenticationHandler}} when checking whether the UI is secured.
> {code}
>* @param authHandler The short-name (or fully qualified class name) of the
>* authentication handler.
> {code}
> *AppBlock.java*
> {code}
> // check if UI is unsecured.
> String httpAuth = 
> conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE);
> this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple");
> {code}






[jira] [Commented] (YARN-9483) DistributedShell does not release container when failed to localize at launch

2019-05-09 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836602#comment-16836602
 ] 

Hudson commented on YARN-9483:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16532 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16532/])
YARN-9483. DistributedShell does not release container when failed to (gifuma: 
rev ec361263464a903348bb80f23801094b4e0570d1)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java


> DistributedShell does not release container when failed to localize at launch
> -
>
> Key: YARN-9483
> URL: https://issues.apache.org/jira/browse/YARN-9483
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9483-001.patch
>
>
> DistributedShell does not release the container when localization fails at
> launch. The launch thread does not increment the completed and failed
> container counts when localization fails, so the main thread keeps waiting
> for the containers to complete instead of failing the job.
> {code}
> yarn jar 
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -shell_command ls  -shell_args / -jar  
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -localize_files /tmp/prabhu
> {code}
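
The failure mode described above can be sketched in isolation. This is only an illustration under stated assumptions: the class `LaunchAccounting`, its counters and its `launch` method are hypothetical stand-ins and are not taken from ApplicationMaster.java or from the attached patch. The point is that a launch attempt which fails during setup should still be counted as completed and failed, so a waiting main loop can terminate.

```java
import java.io.FileNotFoundException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the application master's bookkeeping (names are assumptions).
class LaunchAccounting {
  final AtomicInteger numCompletedContainers = new AtomicInteger();
  final AtomicInteger numFailedContainers = new AtomicInteger();
  final int numTotalContainers;

  LaunchAccounting(int numTotalContainers) {
    this.numTotalContainers = numTotalContainers;
  }

  // Runs one launch attempt. If setup throws, the container is counted as
  // both completed and failed instead of being silently forgotten.
  void launch(Runnable localizeAndStart) {
    try {
      localizeAndStart.run();
    } catch (RuntimeException e) {
      numCompletedContainers.incrementAndGet();
      numFailedContainers.incrementAndGet();
    }
  }

  // The condition a waiting main thread would poll.
  boolean done() {
    return numCompletedContainers.get() >= numTotalContainers;
  }

  public static void main(String[] args) {
    LaunchAccounting am = new LaunchAccounting(1);
    am.launch(() -> {
      throw new UncheckedIOException("Error during localization setup",
          new FileNotFoundException("File does not exist"));
    });
    System.out.println("done=" + am.done() + " failed=" + am.numFailedContainers.get());
  }
}
```

Without the catch block, `done()` would never become true for a container whose localization failed, which matches the hang reported in this issue.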






[jira] [Commented] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler

2019-05-09 Thread Giovanni Matteo Fumarola (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836595#comment-16836595
 ] 

Giovanni Matteo Fumarola commented on YARN-9522:


Thanks [~Prabhu Joseph]. 
Can you add an additional set of parentheses to make the statements more 
readable?

e.g. ( (c1) && (c2)) || c3;
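
As an illustration of explicit grouping applied to this issue, here is a minimal sketch of a check that accepts both the short type and the handler's class name. The helper `isUnsecured` and the class `UnsecuredUiCheck` are hypothetical, and the committed patch may group its conditions differently. The constant is, to my understanding, the fully qualified name of the pseudo handler in hadoop-auth; treat it as an assumption and verify it against your Hadoop version.

```java
// Hypothetical helper, not the code from YARN-9522's patch.
class UnsecuredUiCheck {
  // Assumed fully qualified class name of the "simple" (pseudo) handler.
  static final String PSEUDO_HANDLER_FQCN =
      "org.apache.hadoop.security.authentication.server.PseudoAuthenticationHandler";

  // Each sub-condition is parenthesized so the grouping is visible at a glance.
  static boolean isUnsecured(String httpAuth) {
    return (httpAuth != null)
        && ((httpAuth.equals("simple")) || (httpAuth.equals(PSEUDO_HANDLER_FQCN)));
  }

  public static void main(String[] args) {
    System.out.println(isUnsecured("simple"));
    System.out.println(isUnsecured(PSEUDO_HANDLER_FQCN));
    System.out.println(isUnsecured("kerberos"));
    System.out.println(isUnsecured(null));
  }
}
```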

> AppBlock ignores full qualified class name of PseudoAuthenticationHandler
> -
>
> Key: YARN-9522
> URL: https://issues.apache.org/jira/browse/YARN-9522
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: YARN-9522-001.patch
>
>
> {{AuthenticationHandler}} can be configured using either its fqcn or its type.
> {{AppBlock}} checks only for the type simple and ignores the fqcn of
> {{PseudoAuthenticationHandler}} when checking whether the UI is secured.
> {code}
>* @param authHandler The short-name (or fully qualified class name) of the
>* authentication handler.
> {code}
> *AppBlock.java*
> {code}
> // check if UI is unsecured.
> String httpAuth = 
> conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE);
> this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple");
> {code}






[jira] [Updated] (YARN-9483) DistributedShell does not release container when failed to localize at launch

2019-05-09 Thread Giovanni Matteo Fumarola (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-9483:
---
Fix Version/s: 3.3.0

> DistributedShell does not release container when failed to localize at launch
> -
>
> Key: YARN-9483
> URL: https://issues.apache.org/jira/browse/YARN-9483
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9483-001.patch
>
>
> DistributedShell does not release the container when localization fails at
> launch. The launch thread does not increment the completed and failed
> container counts when localization fails, so the main thread keeps waiting
> for the containers to complete instead of failing the job.
> {code}
> yarn jar 
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -shell_command ls  -shell_args / -jar  
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -localize_files /tmp/prabhu
> {code}






[jira] [Commented] (YARN-9483) DistributedShell does not release container when failed to localize at launch

2019-05-09 Thread Giovanni Matteo Fumarola (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836594#comment-16836594
 ] 

Giovanni Matteo Fumarola commented on YARN-9483:


The patch looks good. Committed to trunk.
Thanks [~Prabhu Joseph] for the patch and [~pbacsko] for the initial review.

> DistributedShell does not release container when failed to localize at launch
> -
>
> Key: YARN-9483
> URL: https://issues.apache.org/jira/browse/YARN-9483
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9483-001.patch
>
>
> DistributedShell does not release the container when localization fails at
> launch. The launch thread does not increment the completed and failed
> container counts when localization fails, so the main thread keeps waiting
> for the containers to complete instead of failing the job.
> {code}
> yarn jar 
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -shell_command ls  -shell_args / -jar  
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -localize_files /tmp/prabhu
> {code}






[jira] [Commented] (YARN-9482) DistributedShell job with localization fails in unsecure cluster

2019-05-09 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836569#comment-16836569
 ] 

Sunil Govindan commented on YARN-9482:
--

+ [~rohithsharma]

To me, this change seems good. If there are no other objections, let's get this in. 
[~rohithsharma] pls take a look if you have cycles. Thank you.

> DistributedShell job with localization fails in unsecure cluster
> 
>
> Key: YARN-9482
> URL: https://issues.apache.org/jira/browse/YARN-9482
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9482-001.patch, YARN-9482-002.patch, 
> YARN-9482-003.patch
>
>
> DistributedShell job with localization fails in unsecure cluster. The client
> localizes the input files to the job user's home directory, whereas the AM
> runs as the yarn user and reads from its own home directory.
> *Command:*
> {code}
> yarn jar 
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -shell_command ls  -shell_args / -jar  
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -localize_files /tmp/prabhu
> {code}
> {code}
> Exception in thread "Thread-4" java.io.UncheckedIOException: Error during localization setup
>   at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1495)
>   at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
>   at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
>   at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.run(ApplicationMaster.java:1481)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: File does not exist: hdfs://yarn-ats-1:8020/user/yarn/DistributedShell/application_1554817981283_0003/prabhu
>   at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1579)
>   at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1594)
>   at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1487)
> {code}
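
The mismatch described above can be shown with plain string handling. This is a sketch under assumptions: the path layout is read off the stack trace (`/user/<user>/DistributedShell/<appId>/<file>`), the job user name "alice" is made up, and `stagingPath` is a hypothetical helper rather than a DistributedShell method.

```java
// Hypothetical illustration of why the AM cannot find the uploaded file.
class StagingPathMismatch {
  // Layout assumed from the FileNotFoundException message in the trace.
  static String stagingPath(String user, String appId, String fileName) {
    return "/user/" + user + "/DistributedShell/" + appId + "/" + fileName;
  }

  public static void main(String[] args) {
    String appId = "application_1554817981283_0003";
    // The client uploads under the submitting user's home ("alice" is invented).
    String uploaded = stagingPath("alice", appId, "prabhu");
    // In an unsecure cluster the AM runs as yarn and resolves its own home.
    String lookedUp = stagingPath("yarn", appId, "prabhu");
    System.out.println(uploaded);
    System.out.println(lookedUp);
    System.out.println(uploaded.equals(lookedUp));
  }
}
```

Because both sides derive the path from "the current user's home directory", they only agree when the client and the AM run as the same user.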






[jira] [Commented] (YARN-9504) [UI2] Fair scheduler queue view page is broken

2019-05-09 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836566#comment-16836566
 ] 

Sunil Govindan commented on YARN-9504:
--

+1. Committing shortly

> [UI2] Fair scheduler queue view page is broken
> --
>
> Key: YARN-9504
> URL: https://issues.apache.org/jira/browse/YARN-9504
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, yarn-ui-v2
>Affects Versions: 3.2.0, 3.3.0, 3.2.1
>Reporter: Zoltan Siegl
>Assignee: Zoltan Siegl
>Priority: Major
> Fix For: 3.3.0, 3.2.1
>
> Attachments: Screenshot 2019-04-23 at 14.52.57.png, Screenshot 
> 2019-04-23 at 14.59.35.png, YARN-9504.001.patch, YARN-9504.002.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The UI2 queue page currently displays a white screen for Fair Scheduler.
>  
> In src/main/webapp/app/components/tree-selector.js:377 (getUsedCapacity), the
> code refers to queueData.get("partitionMap"), which is null for a fair
> scheduler queue.






[jira] [Commented] (YARN-9483) DistributedShell does not release container when failed to localize at launch

2019-05-09 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836522#comment-16836522
 ] 

Prabhu Joseph commented on YARN-9483:
-

[~giovanni.fumarola] Can you review this jira when you get time? This fixes the 
hang of the DS job when localization fails at launch. Thanks.

> DistributedShell does not release container when failed to localize at launch
> -
>
> Key: YARN-9483
> URL: https://issues.apache.org/jira/browse/YARN-9483
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9483-001.patch
>
>
> DistributedShell does not release the container when localization fails at
> launch. The launch thread does not increment the completed and failed
> container counts when localization fails, so the main thread keeps waiting
> for the containers to complete instead of failing the job.
> {code}
> yarn jar 
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -shell_command ls  -shell_args / -jar  
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -localize_files /tmp/prabhu
> {code}






[jira] [Commented] (YARN-9482) DistributedShell job with localization fails in unsecure cluster

2019-05-09 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836517#comment-16836517
 ] 

Prabhu Joseph commented on YARN-9482:
-

[~giovanni.fumarola] Can you review this jira when you get time? This fixes the 
DistributedShell job localization failure in an unsecure cluster. Thanks.

> DistributedShell job with localization fails in unsecure cluster
> 
>
> Key: YARN-9482
> URL: https://issues.apache.org/jira/browse/YARN-9482
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: distributed-shell
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9482-001.patch, YARN-9482-002.patch, 
> YARN-9482-003.patch
>
>
> DistributedShell job with localization fails in unsecure cluster. The client
> localizes the input files to the job user's home directory, whereas the AM
> runs as the yarn user and reads from its own home directory.
> *Command:*
> {code}
> yarn jar 
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -shell_command ls  -shell_args / -jar  
> /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar
>  -localize_files /tmp/prabhu
> {code}
> {code}
> Exception in thread "Thread-4" java.io.UncheckedIOException: Error during localization setup
>   at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1495)
>   at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
>   at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
>   at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.run(ApplicationMaster.java:1481)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: File does not exist: hdfs://yarn-ats-1:8020/user/yarn/DistributedShell/application_1554817981283_0003/prabhu
>   at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1579)
>   at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1594)
>   at org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1487)
> {code}






[jira] [Commented] (YARN-9527) Rogue LocalizerRunner/ContainerLocalizer repeatedly downloading same file

2019-05-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836478#comment-16836478
 ] 

Hadoop QA commented on YARN-9527:
-

| (/) *+1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 16m 57s | trunk passed |
| +1 | compile | 1m 3s | trunk passed |
| +1 | checkstyle | 0m 28s | trunk passed |
| +1 | mvnsite | 0m 36s | trunk passed |
| +1 | shadedclient | 11m 12s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 56s | trunk passed |
| +1 | javadoc | 0m 23s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 36s | the patch passed |
| +1 | compile | 0m 58s | the patch passed |
| +1 | javac | 0m 58s | the patch passed |
| +1 | checkstyle | 0m 20s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 0 new + 190 unchanged - 25 fixed = 190 total (was 215) |
| +1 | mvnsite | 0m 33s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 44s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 6s | the patch passed |
| +1 | javadoc | 0m 25s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 20m 51s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 24s | The patch does not generate ASF License warnings. |
| | | 67m 59s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9527 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12968307/YARN-9527.004.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux ba03302ebd88 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 90add05 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24073/testReport/ |
| Max. process+thread count | 412 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24073/console |
| Powered by

[jira] [Commented] (YARN-9527) Rogue LocalizerRunner/ContainerLocalizer repeatedly downloading same file

2019-05-09 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836456#comment-16836456
 ] 

Jim Brennan commented on YARN-9527:
---

Thanks for the review [~ebadger]!  I've put up another patch that adds the 
interrupt() call back in for the running containers case.  I'm not sure it's 
needed, but I think it's safer to keep that code path unchanged.

{quote}
Moving the getPathForLocalization() logic into findNextResource() makes a lot 
of sense so we don't have to go through the bad resources one heartbeat at a 
time and so we'll actually remove them from the pending list.
{quote}
Agreed.  It is possible that this change alone will minimize the window enough 
to prevent the problem by itself.   Instead of taking n seconds to process (and 
remove) n resources from the rogue container pending list, it will do it in one 
heartbeat, with far less opportunity for another container to start with the 
same resources.
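
The idea of clearing bad entries in a single pass can be sketched generically. This is not the NodeManager code: `PendingScan`, `findNext` and the `usable` predicate are hypothetical names, and the real findNextResource() works on localizer-specific types. The sketch only shows a scan that drops every unusable entry it meets before returning the first usable one, instead of discarding one entry per call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of a single-pass scan over a pending list.
class PendingScan {
  // Returns the first usable entry, removing each unusable entry seen before it.
  // Returns null if nothing usable remains.
  static <T> T findNext(List<T> pending, Predicate<T> usable) {
    Iterator<T> it = pending.iterator();
    while (it.hasNext()) {
      T candidate = it.next();
      if (usable.test(candidate)) {
        return candidate;
      }
      it.remove(); // drop the bad entry right away rather than on a later heartbeat
    }
    return null;
  }

  public static void main(String[] args) {
    List<String> pending =
        new ArrayList<>(Arrays.asList("bad-1", "bad-2", "good-1", "bad-3"));
    String next = findNext(pending, r -> r.startsWith("good"));
    System.out.println(next + " " + pending);
  }
}
```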

{quote}
I'm not super wild about adding an LRU cache of 128 recent entries since it 
only makes the race less likely to occur instead of fixing it outright. 
However, this code is very complex and I can understand why you would want to 
make a minimally invasive change. I would like to hear other peoples' thoughts 
on this.
{quote}
The more bulletproof fix would be to change the LocalizerTracker.handle() 
function to look up the container state and only accept the request if the 
container was in the correct state.   Currently the LocalizerTracker doesn't 
access the container directly, so it would either have to look up the container 
from the container id (which I'm not certain is set for all requests) or I 
would have to change the LocalizerContext to include the container directly.
I was concerned that this might be a performance hit (due to the synchronized 
containers list), since we would have to do this for every request from every 
container.

I admit that the LRU approach is not 100% bulletproof, but combined with the 
findNextResource() change, I think it is sufficient to cover the very short 
window in which this problem can occur, and it limits the change to a small 
part of the code.  I am open to suggestions on how big it needs to be.
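
For readers unfamiliar with the pattern, a bounded LRU of recent entries can be built on `java.util.LinkedHashMap`. The sketch below is an assumption-laden illustration, not the patch: the class `RecentlyCleanedContainers`, its method names, and the choice of string container ids as keys are all hypothetical; only the size of 128 comes from the discussion above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded record of recently cleaned-up container ids.
class RecentlyCleanedContainers {
  private final Map<String, Boolean> recent;

  RecentlyCleanedContainers(final int capacity) {
    // accessOrder=true keeps entries ordered from least to most recently used;
    // removeEldestEntry evicts the least recently used one when over capacity.
    this.recent = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
        return size() > capacity;
      }
    };
  }

  synchronized void markCleaned(String containerId) {
    recent.put(containerId, Boolean.TRUE);
  }

  // Uses get() because, unlike containsKey(), it refreshes the entry's recency.
  synchronized boolean wasRecentlyCleaned(String containerId) {
    return recent.get(containerId) != null;
  }

  public static void main(String[] args) {
    RecentlyCleanedContainers lru = new RecentlyCleanedContainers(128);
    lru.markCleaned("container_1");
    System.out.println(lru.wasRecentlyCleaned("container_1"));
    System.out.println(lru.wasRecentlyCleaned("container_2"));
  }
}
```

As the comment itself notes, such a cache only narrows the race window: an id evicted before a late request arrives would no longer be recognized.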

{quote}
It would also be good to prove that this fix actually works, and more 
importantly doesn't break anything else. So I think we should definitely wait 
for that until we put this in (if others agree with the approach)
{quote}
I think the unit test does show that the problem as I understand it is fixed 
(it fails with the old code and succeeds with the new), but I am also 
attempting to repro the failure manually, and will look into getting this fix 
deployed locally so we can test it on a larger cluster.

Thanks again for your feedback [~ebadger], it would be good to get some other 
eyes on this as well, given the complexity of the localization code.


> Rogue LocalizerRunner/ContainerLocalizer repeatedly downloading same file
> -
>
> Key: YARN-9527
> URL: https://issues.apache.org/jira/browse/YARN-9527
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.8.5, 3.1.2
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-9527.001.patch, YARN-9527.002.patch, 
> YARN-9527.003.patch, YARN-9527.004.patch
>
>
> A rogue ContainerLocalizer can get stuck in a loop continuously downloading 
> the same file while generating an "Invalid event: LOCALIZED at LOCALIZED" 
> exception on each iteration.  Sometimes this continues long enough that it 
> fills up a disk or depletes available inodes for the filesystem.






[jira] [Created] (YARN-9541) TestCombinedSystemMetricsPublisher fails intermittent

2019-05-09 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created YARN-9541:
---

 Summary: TestCombinedSystemMetricsPublisher fails intermittent
 Key: YARN-9541
 URL: https://issues.apache.org/jira/browse/YARN-9541
 Project: Hadoop YARN
  Issue Type: Bug
  Components: ATSv2
Affects Versions: 3.2.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.testTimelineServiceEventPublishingV1V2Enabled

{code}
Failing for the past 1 build (Since Failed#24071 )
Took 0.19 sec.
Error Message
java.net.BindException: Problem binding to [0.0.0.0:10200] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
Stacktrace
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.BindException: Problem binding to [0.0.0.0:10200] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139)
at org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:66)
at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:55)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.serviceStart(ApplicationHistoryClientService.java:94)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceStart(ApplicationHistoryServer.java:120)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.testSetup(TestCombinedSystemMetricsPublisher.java:123)
at org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.runTest(TestCombinedSystemMetricsPublisher.java:242)
at org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.testTimelineServiceEventPublishingV1V2Enabled(TestCombinedSystemMetricsPublisher.java:252)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.BindException: Problem binding to [0.0.0.0:10200] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:833)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:738)
at org.apache.hadoop.ipc.Server.bind(Server.java:599)
at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1121)
at org.apache.hadoop.ipc.Server.<init>(Server.java:2976)
at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:1039)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:427)
at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:347)
at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:848)
at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.createServer(RpcServerFactoryPBImpl.java:173)
at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:132)
... 22 more
Caused by: java.net.BindException: Address already in use
   

[jira] [Created] (YARN-9540) TestRMAppTransitions fails intermittently

2019-05-09 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created YARN-9540:
---

 Summary: TestRMAppTransitions fails intermittently
 Key: YARN-9540
 URL: https://issues.apache.org/jira/browse/YARN-9540
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, test
Affects Versions: 3.2.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


Failed
org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppFinishedFinished[0]

{code}
Error Message
expected:<1> but was:<0>
Stacktrace
java.lang.AssertionError: expected:<1> but was:<0>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:631)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.verifyAppCompletedEvent(TestRMAppTransitions.java:1307)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.verifyAppAfterFinishEvent(TestRMAppTransitions.java:1302)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testCreateAppFinished(TestRMAppTransitions.java:648)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppFinishedFinished(TestRMAppTransitions.java:1083)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)

{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9527) Rogue LocalizerRunner/ContainerLocalizer repeatedly downloading same file

2019-05-09 Thread Jim Brennan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated YARN-9527:
--
Attachment: YARN-9527.004.patch

> Rogue LocalizerRunner/ContainerLocalizer repeatedly downloading same file
> -
>
> Key: YARN-9527
> URL: https://issues.apache.org/jira/browse/YARN-9527
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.8.5, 3.1.2
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-9527.001.patch, YARN-9527.002.patch, 
> YARN-9527.003.patch, YARN-9527.004.patch
>
>
> A rogue ContainerLocalizer can get stuck in a loop continuously downloading 
> the same file while generating an "Invalid event: LOCALIZED at LOCALIZED" 
> exception on each iteration.  Sometimes this continues long enough that it 
> fills up a disk or depletes available inodes for the filesystem.






[jira] [Commented] (YARN-9489) Support filtering by request-priorities and allocation-request-ids for query results of app activities

2019-05-09 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836417#comment-16836417
 ] 

Hudson commented on YARN-9489:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16530 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16530/])
YARN-9489. Support filtering by request-priorities and (wwei: rev 
90add05caa6c48659f0c592ec391b30f2a76069e)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/PassThroughRESTRequestInterceptor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/webapp/FederationInterceptorREST.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/activities/ActivitiesManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/webapp/RouterWebServices.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/activities/AppAllocation.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWSConsts.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesSchedulerActivities.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/BaseRouterWebServicesTest.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/ActivitiesTestUtils.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServiceProtocol.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/webapp/DefaultRequestInterceptorREST.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/MockRESTRequestInterceptor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java


> Support filtering by request-priorities and allocation-request-ids for query 
> results of app activities
> --
>
> Key: YARN-9489
> URL: https://issues.apache.org/jira/browse/YARN-9489
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9489.001.patch, YARN-9489.002.patch
>
>
> [Design Doc 
> #4.2|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.m04tqsosk94h]






[jira] [Commented] (YARN-9489) Support filtering by request-priorities and allocation-request-ids for query results of app activities

2019-05-09 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836401#comment-16836401
 ] 

Weiwei Yang commented on YARN-9489:
---

Thanks for confirming it, +1. Committing shortly.

> Support filtering by request-priorities and allocation-request-ids for query 
> results of app activities
> --
>
> Key: YARN-9489
> URL: https://issues.apache.org/jira/browse/YARN-9489
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-9489.001.patch, YARN-9489.002.patch
>
>
> [Design Doc 
> #4.2|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.m04tqsosk94h]






[jira] [Commented] (YARN-8622) NodeManager native build fails due to getgrouplist not found on macOS

2019-05-09 Thread Siyao Meng (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836393#comment-16836393
 ] 

Siyao Meng commented on YARN-8622:
--

[~eyang] I just added branch-3.1 as a target branch. Could you commit this to 
branch-3.1 as well, since YARN-7221 is also in branch-3.1? I missed that 
target branch before. Thank you!

> NodeManager native build fails due to getgrouplist not found on macOS
> -
>
> Key: YARN-8622
> URL: https://issues.apache.org/jira/browse/YARN-8622
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.2.0, 3.3.0
> Environment: Darwin 17.7.0 Darwin Kernel Version 17.7.0: Thu Jun 21 
> 22:53:14 PDT 2018; root:xnu-4570.71.2~1/RELEASE_X86_64 x86_64
> Apple LLVM version 9.1.0 (clang-902.0.39.2)
>Reporter: Ewan Higgs
>Assignee: Siyao Meng
>Priority: Major
> Fix For: 3.3.0, 3.2.1
>
> Attachments: YARN-8622.001.patch, YARN-8622.002.patch
>
>
> Usage of getgrouplist() was added in YARN-7221, so this should affect Hadoop 
> 3.2.0 and later.
> Compiler:
> {code}
> $ /Library/Developer/CommandLineTools/usr/bin/c++ --version
> Apple LLVM version 9.1.0 (clang-902.0.39.2)
> Target: x86_64-apple-darwin17.7.0
> Thread model: posix
> InstalledDir: /Library/Developer/CommandLineTools/usr/bin
> {code}
> Build line:
> {code}
> [WARNING] /Library/Developer/CommandLineTools/usr/bin/c++   -g -O2 -Wall 
> -pthread -D_FILE_OFFSET_BITS=64 -Wl,-search_paths_first 
> -Wl,-headerpad_max_install_names   
> CMakeFiles/test-oom-listener.dir/main/native/oom-listener/impl/oom_listener.c.o
>  
> CMakeFiles/test-oom-listener.dir/main/native/oom-listener/test/oom_listener_test_main.cc.o
>   -o test/test-oom-listener libgtest.a -lrt 
> {code}
> Error message: 
> {code}
> ...
> [WARNING] 
> /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c:1264:12:
>  error: no matching function for call to 'getgrouplist'
> [WARNING]   int rc = getgrouplist(user, pw->pw_gid, groups, &ngroups);
> [WARNING]^~~~
> [WARNING] /usr/include/unistd.h:653:6: note: candidate function not viable: 
> no known conversion from 'gid_t *' (aka 'unsigned int *') to 'int *' for 3rd 
> argument
> [WARNING] int  getgrouplist(const char *, int, int *, int *);
> [WARNING]  ^
> [WARNING] In file included from 
> /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/utils/test_docker_util.cc:24:
> [WARNING] 
> /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c:1271:9:
>  error: no matching function for call to 'getgrouplist'
> [WARNING] if (getgrouplist(user, pw->pw_gid, groups, &ngroups) == -1) {
> [WARNING] ^~~~
> [WARNING] /usr/include/unistd.h:653:6: note: candidate function not viable: 
> no known conversion from 'gid_t *' (aka 'unsigned int *') to 'int *' for 3rd 
> argument
> [WARNING] int  getgrouplist(const char *, int, int *, int *);
> [WARNING]  ^
> [WARNING] 2 warnings and 2 errors generated.
> [WARNING] make[2]: *** 
> [CMakeFiles/cetest.dir/main/native/container-executor/test/utils/test_docker_util.cc.o]
>  Error 1
> [WARNING] make[1]: *** [CMakeFiles/cetest.dir/all] Error 2
> [WARNING] make: *** [all] Error 2
> {code}






[jira] [Comment Edited] (YARN-9508) YarnConfiguration areNodeLabel enabled is costly in allocation flow

2019-05-09 Thread Bilwa S T (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836385#comment-16836385
 ] 

Bilwa S T edited comment on YARN-9508 at 5/9/19 1:39 PM:
-

Hi [~bibinchundatt], thanks for reviewing.

1. The CheckStyle issue below can't be addressed:
{quote}public static void normalizeAndValidateRequest(ResourceRequest 
resReq,:22: More than 7 parameters (found 8). [ParameterNumber]
{quote}
2. The test case failures are random and not related to my changes.

 


was (Author: bilwast):
Hi [~bibinchundatt]        

> YarnConfiguration areNodeLabel enabled is costly in allocation flow
> ---
>
> Key: YARN-9508
> URL: https://issues.apache.org/jira/browse/YARN-9508
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bilwa S T
>Priority: Critical
> Attachments: YARN-9508-001.patch, YARN-9508-002.patch
>
>
> Locking on every allocate request can be avoided, which would improve 
> performance.
> {noformat}
> "pool-6-thread-300" #624 prio=5 os_prio=0 tid=0x7f2f91152800 nid=0x8ec5 
> waiting for monitor entry [0x7f1ec6a8d000]
>  java.lang.Thread.State: BLOCKED (on object monitor)
>  at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2841)
>  - waiting to lock <0x7f1f8107c748> (a 
> org.apache.hadoop.yarn.conf.YarnConfiguration)
>  at org.apache.hadoop.conf.Configuration.get(Configuration.java:1214)
>  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1268)
>  at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1674)
>  at 
> org.apache.hadoop.yarn.conf.YarnConfiguration.areNodeLabelsEnabled(YarnConfiguration.java:3646)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:234)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndvalidateRequest(SchedulerUtils.java:274)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:261)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:242)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:427)
>  - locked <0x7f24dd3f9e40> (a 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService$AllocateResponseLock)
>  at 
> org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:352)
>  at 
> org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:349)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>  at 
> org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator.sendContainerRequest(MRAMSimulator.java:348)
>  at 
> org.apache.hadoop.yarn.sls.appmaster.AMSimulator.middleStep(AMSimulator.java:212)
>  at 
> org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:94)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> {noformat}
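
One way to keep this lookup off the allocate path, sketched below, is to read the flag once and hold it in a final field. This is only an illustration and not the attached patch: LockingConf and RequestValidator are hypothetical stand-ins (the former mimics a configuration object whose reads take the object monitor, as in the trace above), and the sketch assumes the setting does not change after startup.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a configuration object whose reads are synchronized. */
final class LockingConf {
  private final Map<String, String> props = new HashMap<>();
  private int lockedReads = 0;

  synchronized void set(String key, String value) {
    props.put(key, value);
  }

  /** Every call takes the monitor, which is what blocks threads in the trace above. */
  synchronized boolean getBoolean(String key, boolean defaultValue) {
    lockedReads++;
    String value = props.get(key);
    return value == null ? defaultValue : Boolean.parseBoolean(value.trim());
  }

  synchronized int lockedReads() {
    return lockedReads;
  }
}

/** Hypothetical validator that reads the flag once instead of once per request. */
final class RequestValidator {
  static final String NODE_LABELS_ENABLED = "yarn.node-labels.enabled";

  // Cached at construction time; assumes the value is fixed for the process lifetime.
  private final boolean nodeLabelsEnabled;

  RequestValidator(LockingConf conf) {
    this.nodeLabelsEnabled = conf.getBoolean(NODE_LABELS_ENABLED, false);
  }

  /** Accepts a label expression only if labels are enabled or the expression is empty. */
  boolean validate(String nodeLabelExpression) {
    // No configuration access here, so no monitor is taken on the hot path.
    return nodeLabelsEnabled
        || nodeLabelExpression == null
        || nodeLabelExpression.isEmpty();
  }
}
```

With this shape, any number of validate() calls results in a single synchronized read, at the cost of not picking up later changes to the setting.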






[jira] [Commented] (YARN-9508) YarnConfiguration areNodeLabel enabled is costly in allocation flow

2019-05-09 Thread Bilwa S T (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836385#comment-16836385
 ] 

Bilwa S T commented on YARN-9508:
-

Hi [~bibinchundatt]        

> YarnConfiguration areNodeLabel enabled is costly in allocation flow
> ---
>
> Key: YARN-9508
> URL: https://issues.apache.org/jira/browse/YARN-9508
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bilwa S T
>Priority: Critical
> Attachments: YARN-9508-001.patch, YARN-9508-002.patch
>
>
> Locking on every allocate request can be avoided, which would improve 
> performance.
> {noformat}
> "pool-6-thread-300" #624 prio=5 os_prio=0 tid=0x7f2f91152800 nid=0x8ec5 
> waiting for monitor entry [0x7f1ec6a8d000]
>  java.lang.Thread.State: BLOCKED (on object monitor)
>  at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2841)
>  - waiting to lock <0x7f1f8107c748> (a 
> org.apache.hadoop.yarn.conf.YarnConfiguration)
>  at org.apache.hadoop.conf.Configuration.get(Configuration.java:1214)
>  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1268)
>  at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1674)
>  at 
> org.apache.hadoop.yarn.conf.YarnConfiguration.areNodeLabelsEnabled(YarnConfiguration.java:3646)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:234)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndvalidateRequest(SchedulerUtils.java:274)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:261)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:242)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92)
>  at 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:427)
>  - locked <0x7f24dd3f9e40> (a 
> org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService$AllocateResponseLock)
>  at 
> org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:352)
>  at 
> org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:349)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>  at 
> org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator.sendContainerRequest(MRAMSimulator.java:348)
>  at 
> org.apache.hadoop.yarn.sls.appmaster.AMSimulator.middleStep(AMSimulator.java:212)
>  at 
> org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:94)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> {noformat}






[jira] [Commented] (YARN-9508) YarnConfiguration areNodeLabel enabled is costly in allocation flow

2019-05-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836382#comment-16836382
 ] 

Hadoop QA commented on YARN-9508:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 10s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 25s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 135 unchanged - 0 fixed = 136 total (was 135) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 59s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}129m  7s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |
|   | hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9508 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12968291/YARN-9508-002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux eb11e171cc8e 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2595125 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/24071/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit | 
https://builds.apache.org/jo

[jira] [Commented] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler

2019-05-09 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836365#comment-16836365
 ] 

Prabhu Joseph commented on YARN-9522:
-

[~giovanni.fumarola] Can you review this JIRA when you get time? Thanks.

> AppBlock ignores full qualified class name of PseudoAuthenticationHandler
> -
>
> Key: YARN-9522
> URL: https://issues.apache.org/jira/browse/YARN-9522
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: YARN-9522-001.patch
>
>
> {{AuthenticationHandler}} can be configured using either the fully qualified 
> class name (fqcn) or the type. {{AppBlock}} checks only for the type simple 
> and ignores the fqcn of {{PseudoAuthenticationHandler}} while checking 
> whether the UI is secured or not.
> {code}
>* @param authHandler The short-name (or fully qualified class name) of the
>* authentication handler.
> {code}
> *AppBlock.java*
> {code}
> // check if UI is unsecured.
> String httpAuth = 
> conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE);
> this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple");
> {code}
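
A check that accepts both forms could look like the sketch below. It is an illustration under assumptions, not the attached patch: UnsecuredUiCheck and isUnsecured are hypothetical names, and the class name constant is written out as a string so the example compiles without Hadoop on the classpath.

```java
/** Hypothetical helper that treats both the short type and the fqcn as unsecured. */
final class UnsecuredUiCheck {
  // Short type name, as compared against in the AppBlock snippet above.
  static final String SIMPLE_TYPE = "simple";

  // Class name of the pseudo handler in hadoop-auth, spelled out as a literal
  // here; real code would more likely use PseudoAuthenticationHandler.class.getName().
  static final String PSEUDO_HANDLER_FQCN =
      "org.apache.hadoop.security.authentication.server.PseudoAuthenticationHandler";

  private UnsecuredUiCheck() {
  }

  /** Returns true when the configured value names the pseudo handler either way. */
  static boolean isUnsecured(String httpAuth) {
    if (httpAuth == null) {
      return false;
    }
    return SIMPLE_TYPE.equals(httpAuth) || PSEUDO_HANDLER_FQCN.equals(httpAuth);
  }
}
```

The value passed in would be whatever conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE) returns; any other handler, such as kerberos, still yields false.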






[jira] [Commented] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler

2019-05-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836325#comment-16836325
 ] 

Hadoop QA commented on YARN-9522:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 53s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
38s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 48m 10s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9522 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12968292/YARN-9522-001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 29d76e063364 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2595125 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24072/testReport/ |
| Max. process+thread count | 446 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24072/console |
| Powered by | Apache Yetus 0.8.0   http://yetu

[jira] [Commented] (YARN-9489) Support filtering by request-priorities and allocation-request-ids for query results of app activities

2019-05-09 Thread Tao Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836312#comment-16836312
 ] 

Tao Yang commented on YARN-9489:


Thanks [~cheersyang] for the review.

{quote}

{{getAppActivities()}} adds two new query parameters; that means we are 
not breaking old APIs, correct?

{quote}

Yes, this change will enhance old APIs instead of breaking them.
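
To illustrate why optional filters leave existing callers unaffected, here is a minimal sketch. The names AppAllocationRecord and AppActivityFilter are hypothetical and are not taken from the patch; the only assumption is that an absent or empty filter means "do not filter on this field".

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical record of one app allocation attempt. */
final class AppAllocationRecord {
  final int requestPriority;
  final long allocationRequestId;

  AppAllocationRecord(int requestPriority, long allocationRequestId) {
    this.requestPriority = requestPriority;
    this.allocationRequestId = allocationRequestId;
  }
}

/** Hypothetical filter: a null or empty set leaves that dimension unfiltered. */
final class AppActivityFilter {
  private AppActivityFilter() {
  }

  static List<AppAllocationRecord> filter(
      List<AppAllocationRecord> records,
      Set<Integer> requestPriorities,
      Set<Long> allocationRequestIds) {
    return records.stream()
        // Callers that pass no priorities see every priority, as before.
        .filter(r -> requestPriorities == null
            || requestPriorities.isEmpty()
            || requestPriorities.contains(r.requestPriority))
        // Callers that pass no request ids see every request id, as before.
        .filter(r -> allocationRequestIds == null
            || allocationRequestIds.isEmpty()
            || allocationRequestIds.contains(r.allocationRequestId))
        .collect(Collectors.toList());
  }
}
```

A request that omits both parameters gets the full, unfiltered list, which matches the pre-change behaviour.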

{quote}

Another thing is please create another Jira to add some doc about the 
{{app-activities}} restful API in RM rest doc.

{quote}

Sure, I will add the documentation, including the REST APIs, in YARN-9538.

> Support filtering by request-priorities and allocation-request-ids for query 
> results of app activities
> --
>
> Key: YARN-9489
> URL: https://issues.apache.org/jira/browse/YARN-9489
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-9489.001.patch, YARN-9489.002.patch
>
>
> [Design Doc 
> #4.2|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.m04tqsosk94h]






[jira] [Created] (YARN-9539) Improve cleanup process of app activities and make some conditions configurable

2019-05-09 Thread Tao Yang (JIRA)
Tao Yang created YARN-9539:
--

 Summary: Improve cleanup process of app activities and make some 
conditions configurable
 Key: YARN-9539
 URL: https://issues.apache.org/jira/browse/YARN-9539
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler
 Environment: [YARN-9050 Design doc 
#4.4|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.crdawajmm3a4]
Reporter: Tao Yang
Assignee: Tao Yang









[jira] [Created] (YARN-9538) Document scheduler/app activities and REST APIs

2019-05-09 Thread Tao Yang (JIRA)
Tao Yang created YARN-9538:
--

 Summary: Document scheduler/app activities and REST APIs
 Key: YARN-9538
 URL: https://issues.apache.org/jira/browse/YARN-9538
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: documentation
Reporter: Tao Yang
Assignee: Tao Yang


Add documentation for scheduler/app activities in CapacityScheduler.md and 
ResourceManagerRest.md.






[jira] [Commented] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler

2019-05-09 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836292#comment-16836292
 ] 

Prabhu Joseph commented on YARN-9522:
-

When hadoop.http.authentication.type is set to simple, the YARN UI is considered 
unsecured and the "Kill" button is not displayed, whereas the button is displayed 
when the value is the fully qualified class name 
org.apache.hadoop.security.authentication.server.PseudoAuthenticationHandler. 
I have tested the fix and it works fine.

> AppBlock ignores full qualified class name of PseudoAuthenticationHandler
> -
>
> Key: YARN-9522
> URL: https://issues.apache.org/jira/browse/YARN-9522
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: YARN-9522-001.patch
>
>
> {{AuthenticationHandler}} can be configured using either its fully qualified 
> class name (fqcn) or its type. {{AppBlock}} checks only for the type simple and 
> ignores the fqcn of {{PseudoAuthenticationHandler}} when checking whether the 
> UI is secured.
> {code}
>* @param authHandler The short-name (or fully qualified class name) of the
>* authentication handler.
> {code}
> *AppBlock.java*
> {code}
> // check if UI is unsecured.
> String httpAuth = 
> conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE);
> this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple");
> {code}
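
One way to close this gap can be sketched as follows. This is not the attached patch; it only assumes that the check needs to recognise the two spellings mentioned above. The class name UnsecuredUiCheck and the method isUnsecuredUi are hypothetical, and a plain string stands in for the value read from Hadoop's Configuration.

```java
// Hypothetical helper; a sketch of the idea, not the code from YARN-9522-001.patch.
class UnsecuredUiCheck {
  // Short type name accepted by hadoop.http.authentication.type.
  static final String SIMPLE_TYPE = "simple";
  // Fully qualified class name of the pseudo handler, as quoted in this thread.
  static final String PSEUDO_HANDLER_FQCN =
      "org.apache.hadoop.security.authentication.server.PseudoAuthenticationHandler";

  // Returns true when the configured value names the pseudo handler,
  // either by its short type or by its fully qualified class name.
  static boolean isUnsecuredUi(String httpAuth) {
    return httpAuth != null
        && (httpAuth.equals(SIMPLE_TYPE) || httpAuth.equals(PSEUDO_HANDLER_FQCN));
  }

  public static void main(String[] args) {
    System.out.println(isUnsecuredUi("simple"));
    System.out.println(isUnsecuredUi(PSEUDO_HANDLER_FQCN));
    System.out.println(isUnsecuredUi("kerberos"));
    System.out.println(isUnsecuredUi(null));
  }
}
```

Comparing against string constants keeps the sketch free of a compile-time dependency on the handler class; real code could equally compare against the class's canonical name.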






[jira] [Updated] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler

2019-05-09 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9522:

Attachment: YARN-9522-001.patch

> AppBlock ignores full qualified class name of PseudoAuthenticationHandler
> -
>
> Key: YARN-9522
> URL: https://issues.apache.org/jira/browse/YARN-9522
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Minor
> Attachments: YARN-9522-001.patch
>
>
> {{AuthenticationHandler}} can be configured using either its fully qualified 
> class name (fqcn) or its type. {{AppBlock}} checks only for the type simple and 
> ignores the fqcn of {{PseudoAuthenticationHandler}} when checking whether the 
> UI is secured.
> {code}
>* @param authHandler The short-name (or fully qualified class name) of the
>* authentication handler.
> {code}
> *AppBlock.java*
> {code}
> // check if UI is unsecured.
> String httpAuth = 
> conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE);
> this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple");
> {code}






[jira] [Updated] (YARN-9508) YarnConfiguration areNodeLabel enabled is costly in allocation flow

2019-05-09 Thread Bilwa S T (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-9508:

Attachment: YARN-9508-002.patch

> YarnConfiguration areNodeLabel enabled is costly in allocation flow
> ---
>
> Key: YARN-9508
> URL: https://issues.apache.org/jira/browse/YARN-9508
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bilwa S T
>Priority: Critical
> Attachments: YARN-9508-001.patch, YARN-9508-002.patch
>
>
> The lock taken on every allocate request can be avoided, which would improve performance.
> {noformat}
> "pool-6-thread-300" #624 prio=5 os_prio=0 tid=0x7f2f91152800 nid=0x8ec5 waiting for monitor entry [0x7f1ec6a8d000]
>  java.lang.Thread.State: BLOCKED (on object monitor)
>  at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2841)
>  - waiting to lock <0x7f1f8107c748> (a org.apache.hadoop.yarn.conf.YarnConfiguration)
>  at org.apache.hadoop.conf.Configuration.get(Configuration.java:1214)
>  at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1268)
>  at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1674)
>  at org.apache.hadoop.yarn.conf.YarnConfiguration.areNodeLabelsEnabled(YarnConfiguration.java:3646)
>  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:234)
>  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndvalidateRequest(SchedulerUtils.java:274)
>  at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:261)
>  at org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:242)
>  at org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75)
>  at org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92)
>  at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:427)
>  - locked <0x7f24dd3f9e40> (a org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService$AllocateResponseLock)
>  at org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:352)
>  at org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:349)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>  at org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator.sendContainerRequest(MRAMSimulator.java:348)
>  at org.apache.hadoop.yarn.sls.appmaster.AMSimulator.middleStep(AMSimulator.java:212)
>  at org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:94)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> {noformat}
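
The trace above shows a thread blocked on the configuration monitor inside areNodeLabelsEnabled. A minimal sketch of one way to avoid that per-request lock is to read the flag once and cache it. This is not necessarily what the attached patches do, and it assumes the flag does not change while the process runs. LockedConfig is a hypothetical stand-in for a configuration object with synchronized lookups, and NodeLabelFlagCache is likewise a made-up name.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: cache a boolean setting so the hot path never takes the config lock.
class NodeLabelFlagCache {

  // Stand-in for a configuration object whose lookups are synchronized.
  static class LockedConfig {
    private final boolean nodeLabelsEnabled;
    final AtomicInteger lookups = new AtomicInteger();

    LockedConfig(boolean nodeLabelsEnabled) {
      this.nodeLabelsEnabled = nodeLabelsEnabled;
    }

    // Every call contends on this object's monitor, like the blocked frame in the trace.
    synchronized boolean areNodeLabelsEnabled() {
      lookups.incrementAndGet();
      return nodeLabelsEnabled;
    }
  }

  // Read once at construction; later reads need no lock because the field is final.
  private final boolean nodeLabelsEnabled;

  NodeLabelFlagCache(LockedConfig conf) {
    this.nodeLabelsEnabled = conf.areNodeLabelsEnabled();
  }

  // Called on every allocate request in this sketch; it does not touch the config.
  boolean nodeLabelsEnabled() {
    return nodeLabelsEnabled;
  }

  public static void main(String[] args) {
    LockedConfig conf = new LockedConfig(true);
    NodeLabelFlagCache cache = new NodeLabelFlagCache(conf);
    for (int i = 0; i < 1000; i++) {
      cache.nodeLabelsEnabled();
    }
    System.out.println("config lookups: " + conf.lookups.get());
  }
}
```

If the setting could be refreshed at runtime, the cached value would need to be invalidated or re-read on refresh; the sketch deliberately leaves that out.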






[jira] [Comment Edited] (YARN-9537) Add configuration to support AM preemption

2019-05-09 Thread Liu Shaohui (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836256#comment-16836256
 ] 

Liu Shaohui edited comment on YARN-9537 at 5/9/19 10:04 AM:


support -> disable?


was (Author: liushaohui):
h1. support -> disable

> Add configuration to support AM preemption
> --
>
> Key: YARN-9537
> URL: https://issues.apache.org/jira/browse/YARN-9537
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: zhoukang
>Priority: Major
>
> In our production cluster we can tolerate AM preemption, so we propose adding 
> a configuration option to support it.






[jira] [Commented] (YARN-9537) Add configuration to support AM preemption

2019-05-09 Thread Liu Shaohui (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836256#comment-16836256
 ] 

Liu Shaohui commented on YARN-9537:
---

h1. support -> disable

> Add configuration to support AM preemption
> --
>
> Key: YARN-9537
> URL: https://issues.apache.org/jira/browse/YARN-9537
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Reporter: zhoukang
>Priority: Major
>
> In our production cluster we can tolerate AM preemption, so we propose adding 
> a configuration option to support it.






[jira] [Commented] (YARN-9519) TFile log aggregation file format is insensitive to the yarn.log-aggregation.TFile.remote-app-log-dir config

2019-05-09 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836239#comment-16836239
 ] 

Sunil Govindan commented on YARN-9519:
--

@[sumasai.shivapra...@gmail.com|mailto:sumasai.shivapra...@gmail.com]

Could you please take a look?

> TFile log aggregation file format is insensitive to the 
> yarn.log-aggregation.TFile.remote-app-log-dir config
> 
>
> Key: YARN-9519
> URL: https://issues.apache.org/jira/browse/YARN-9519
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 3.2.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-9519.001.patch, YARN-9519.002.patch, 
> YARN-9519.003.patch, YARN-9519.004.patch, YARN-9519.005.patch
>
>
> The TFile log aggregation file format is not sensitive to the 
> yarn.log-aggregation.TFile.remote-app-log-dir config.
> In {{LogAggregationTFileController$initInternal}}:
> {code:java}
> this.remoteRootLogDir = new Path(
> conf.get(YarnConfiguration.NM_REMOTE_APP_LOG_DIR,
> YarnConfiguration.DEFAULT_NM_REMOTE_APP_LOG_DIR));
> {code}
> So the remoteRootLogDir is only aware of the 
> yarn.nodemanager.remote-app-log-dir config, while other file formats, such as 
> IFile, first look up the file-format-specific config, so that config takes priority.
> From {{LogAggregationIndexedFileController$initInternal}}:
> {code:java}
> String remoteDirStr = String.format(
> YarnConfiguration.LOG_AGGREGATION_REMOTE_APP_LOG_DIR_FMT,
> this.fileControllerName);
> String remoteDir = conf.get(remoteDirStr);
> if (remoteDir == null || remoteDir.isEmpty()) {
>   remoteDir = conf.get(YarnConfiguration.NM_REMOTE_APP_LOG_DIR,
>   YarnConfiguration.DEFAULT_NM_REMOTE_APP_LOG_DIR);
> }
> {code}
> (These configs are defined as follows:)
> {code:java}
> public static final String LOG_AGGREGATION_REMOTE_APP_LOG_DIR_FMT
>   = YARN_PREFIX + "log-aggregation.%s.remote-app-log-dir";
> public static final String NM_REMOTE_APP_LOG_DIR = 
> NM_PREFIX + "remote-app-log-dir";
> {code}
> I suggest that TFile first try to obtain the remote dir config from 
> yarn.log-aggregation.TFile.remote-app-log-dir and, only if that is not 
> specified, fall back to the yarn.nodemanager.remote-app-log-dir config.
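
The suggested lookup order can be sketched as follows. This is an illustration only, not the attached patch: a plain Map stands in for Hadoop's Configuration, the class name RemoteLogDirResolver is hypothetical, and the default value /tmp/logs is an assumed placeholder for DEFAULT_NM_REMOTE_APP_LOG_DIR. The two key names are the ones quoted in the description above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the format-specific-first, NodeManager-second lookup.
class RemoteLogDirResolver {
  // Key pattern and fallback key as quoted in the issue description.
  static final String FORMAT_KEY_FMT = "yarn.log-aggregation.%s.remote-app-log-dir";
  static final String NM_KEY = "yarn.nodemanager.remote-app-log-dir";
  // Assumed placeholder for the default remote log dir.
  static final String DEFAULT_DIR = "/tmp/logs";

  static String resolve(Map<String, String> conf, String controllerName) {
    // 1. Try the file-format-specific key, e.g. yarn.log-aggregation.TFile.remote-app-log-dir.
    String remoteDir = conf.get(String.format(FORMAT_KEY_FMT, controllerName));
    // 2. Fall back to the NodeManager-wide key, then to the default.
    if (remoteDir == null || remoteDir.isEmpty()) {
      remoteDir = conf.getOrDefault(NM_KEY, DEFAULT_DIR);
    }
    return remoteDir;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(NM_KEY, "/nm/logs");
    System.out.println(resolve(conf, "TFile"));
    conf.put("yarn.log-aggregation.TFile.remote-app-log-dir", "/tfile/logs");
    System.out.println(resolve(conf, "TFile"));
  }
}
```

With this order, TFile would behave like the IFile controller quoted above: the format-specific setting wins whenever it is present and non-empty.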






[jira] [Created] (YARN-9537) Add configuration to support AM preemption

2019-05-09 Thread zhoukang (JIRA)
zhoukang created YARN-9537:
--

 Summary: Add configuration to support AM preemption
 Key: YARN-9537
 URL: https://issues.apache.org/jira/browse/YARN-9537
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Reporter: zhoukang


In our production cluster we can tolerate AM preemption, so we propose adding 
a configuration option to support it.






[jira] [Commented] (YARN-9489) Support filtering by request-priorities and allocation-request-ids for query results of app activities

2019-05-09 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836208#comment-16836208
 ] 

Weiwei Yang commented on YARN-9489:
---

Hi [~Tao Yang]

Most changes seem good to me. Only one thing: in {{RMWebServiceProtocol}}, the 
changes to {{#getAppActivities()}} add two new query parameters, which means 
we are not breaking old APIs, correct? Just want to confirm.

Another thing: please create another Jira to add some documentation for the 
{{app-activities}} REST API to the RM REST doc. Currently it is missing.

Thanks

> Support filtering by request-priorities and allocation-request-ids for query 
> results of app activities
> --
>
> Key: YARN-9489
> URL: https://issues.apache.org/jira/browse/YARN-9489
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Tao Yang
>Assignee: Tao Yang
>Priority: Major
> Attachments: YARN-9489.001.patch, YARN-9489.002.patch
>
>
> [Design Doc 
> #4.2|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.m04tqsosk94h]


