[jira] [Updated] (YARN-9497) Support grouping by diagnostics for query results of scheduler and app activities
[ https://issues.apache.org/jira/browse/YARN-9497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-9497: --- Attachment: YARN-9497.001.patch > Support grouping by diagnostics for query results of scheduler and app > activities > - > > Key: YARN-9497 > URL: https://issues.apache.org/jira/browse/YARN-9497 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-9497.001.patch > > > [Design Doc > #4.3|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.6fbpge17dmmr] -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler
[ https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836705#comment-16836705 ] Hudson commented on YARN-9522: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16535 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16535/]) YARN-9522. AppBlock ignores full qualified class name of (gifuma: rev 1b48100a5e5c6a08b91a9283bc2dbb7725e3236d) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/webapp/AppBlock.java > AppBlock ignores full qualified class name of PseudoAuthenticationHandler > - > > Key: YARN-9522 > URL: https://issues.apache.org/jira/browse/YARN-9522 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-9522-001.patch, YARN-9522-002.patch > > > {{AuthenticationHandler}} can be either configured using fqcn or type. > {{AppBlock}} checks for only the type simple and ignores the fqcn of > {{PseudoAuthenticationHandler}} while checking whether ui is secured or not. > {code} >* @param authHandler The short-name (or fully qualified class name) of the >* authentication handler. > {code} > *AppBlock.java* > {code} > // check if UI is unsecured. > String httpAuth = > conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE); > this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple"); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler
[ https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-9522: --- Fix Version/s: 3.3.0 > AppBlock ignores full qualified class name of PseudoAuthenticationHandler > - > > Key: YARN-9522 > URL: https://issues.apache.org/jira/browse/YARN-9522 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-9522-001.patch, YARN-9522-002.patch > > > {{AuthenticationHandler}} can be either configured using fqcn or type. > {{AppBlock}} checks for only the type simple and ignores the fqcn of > {{PseudoAuthenticationHandler}} while checking whether ui is secured or not. > {code} >* @param authHandler The short-name (or fully qualified class name) of the >* authentication handler. > {code} > *AppBlock.java* > {code} > // check if UI is unsecured. > String httpAuth = > conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE); > this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple"); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler
[ https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836701#comment-16836701 ] Giovanni Matteo Fumarola commented on YARN-9522: My bad. [^YARN-9522-001.patch] was ok. Committed v1 to trunk. Thanks [~Prabhu Joseph]. > AppBlock ignores full qualified class name of PseudoAuthenticationHandler > - > > Key: YARN-9522 > URL: https://issues.apache.org/jira/browse/YARN-9522 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-9522-001.patch, YARN-9522-002.patch > > > {{AuthenticationHandler}} can be either configured using fqcn or type. > {{AppBlock}} checks for only the type simple and ignores the fqcn of > {{PseudoAuthenticationHandler}} while checking whether ui is secured or not. > {code} >* @param authHandler The short-name (or fully qualified class name) of the >* authentication handler. > {code} > *AppBlock.java* > {code} > // check if UI is unsecured. > String httpAuth = > conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE); > this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple"); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9518) can't use CGroups with YARN in centos7
[ https://issues.apache.org/jira/browse/YARN-9518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836669#comment-16836669 ]

Jim Brennan commented on YARN-9518:
-----------------------------------

[~shurong.mai], I believe that [~jhung] is correct, and this problem was fixed in -YARN-2194.- The question is, what version of Hadoop are you running? This should not be a problem in any release after 2.8. The variable LINUX_PATH_SEPARATOR (which is {{%}}) is now used as a separator instead of a comma, so the comma in the path {{/sys/fs/cgroup/cpu,cpuacct}} is handled properly as part of the filename.

If you are running Hadoop version 2.7, then the proper fix would be to pull YARN-2194 back to that branch (if applicable).

If you are running 2.8 or later, is it possible that you have an old container-executor binary? That's the only way I think you would still be seeing this.

We have been running on RHEL7 for over a year, which has the same cgroups as centos7, and we are not seeing this problem.

> can't use CGroups with YARN in centos7
> --------------------------------------
>
>                 Key: YARN-9518
>                 URL: https://issues.apache.org/jira/browse/YARN-9518
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.2.0, 2.9.2, 2.8.5, 2.7.7, 3.1.2
>            Reporter: Shurong Mai
>            Priority: Major
>              Labels: cgroup, patch
>         Attachments: YARN-9518-branch-2.7.7.001.patch, YARN-9518-trunk.001.patch, YARN-9518.patch
>
>
> The OS version is centos7.
> {code:java}
> cat /etc/redhat-release
> CentOS Linux release 7.3.1611 (Core)
> {code}
> After I set the cgroup configuration variables for YARN, the nodemanager started without any problem. But when I ran a job, the job failed with the exceptional nodemanager logs shown at the end.
> The important line in these logs is "Can't open file /sys/fs/cgroup/cpu as node manager - Is a directory".
> After analysing the failure, I found the reason. In centos6, the cgroup "cpu" and "cpuacct" subsystems are as follows:
> {code:java}
> /sys/fs/cgroup/cpu
> /sys/fs/cgroup/cpuacct
> {code}
> But in centos7, they are as follows:
> {code:java}
> /sys/fs/cgroup/cpu -> cpu,cpuacct
> /sys/fs/cgroup/cpuacct -> cpu,cpuacct
> /sys/fs/cgroup/cpu,cpuacct{code}
> "cpu" and "cpuacct" have been merged as "cpu,cpuacct"; "cpu" and "cpuacct" are now symbolic links.
> Looking at the source code, the nodemanager gets the cgroup subsystem info by reading /proc/mounts, so the path it gets for the cpu and cpuacct subsystems is also "/sys/fs/cgroup/cpu,cpuacct".
> The resource description argument of container-executor then looks like this:
> {code:java}
> cgroups=/sys/fs/cgroup/cpu,cpuacct/hadoop-yarn/container_1554210318404_0057_02_01/tasks
> {code}
> There is a comma in the cgroup path, but the comma is also the separator between multiple resources. Therefore, container-executor truncates the cgroup path to "/sys/fs/cgroup/cpu" rather than using the correct cgroup path "/sys/fs/cgroup/cpu,cpuacct/hadoop-yarn/container_1554210318404_0057_02_01/tasks", and reports the error "Can't open file /sys/fs/cgroup/cpu as node manager - Is a directory" in the log.
> Hence I modified the source code and submitted a patch. The idea of the patch is that the nodemanager gets the cgroup cpu path as "/sys/fs/cgroup/cpu" rather than "/sys/fs/cgroup/cpu,cpuacct". As a result, the resource description argument of container-executor looks like this:
> {code:java}
> cgroups=/sys/fs/cgroup/cpu/hadoop-yarn/container_1554210318404_0057_02_01/tasks
> {code}
> Note that there is no comma in the path, and it is a valid path because "/sys/fs/cgroup/cpu" is a symbolic link to "/sys/fs/cgroup/cpu,cpuacct".
> After applying the patch, the problem is resolved and the job runs successfully.
> The patch is compatible with the cgroup paths of both older and current OS versions, such as centos6 and centos7, and is universally applicable to cgroup subsystem paths such as those of the cgroup network subsystem:
> {code:java}
> /sys/fs/cgroup/net_cls -> net_cls,net_prio
> /sys/fs/cgroup/net_prio -> net_cls,net_prio
> /sys/fs/cgroup/net_cls,net_prio{code}
>
>
> ##
> {panel:title=exceptional nodemanager logs:}
> 2019-04-19 20:17:20,095 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1554210318404_0042_01_01 transitioned from LOCALIZED to RUNNING
> 2019-04-19 20:17:20,101 WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exit code from container container_1554210318404_0042_01_01 is : 27
> 2019-04-19 20:17:20,103 WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exception from container-launch with container ID: container_1554210318404_0042_01_00
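To make the truncation discussed in this thread concrete, here is a minimal Java sketch. It is only an illustration: the real container-executor is a C program, and the class and method names below (CgroupSeparatorSketch, splitCgroupPaths) are hypothetical. It splits the same resource argument once on a comma and once on the {{%}} separator mentioned in the comment above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical illustration only; this is not the container-executor code. */
public class CgroupSeparatorSketch {

    /** Splits a "cgroups=<path><sep><path>..." argument into its individual paths. */
    static List<String> splitCgroupPaths(String arg, String separator) {
        String value = arg.substring("cgroups=".length());
        // Pattern.quote keeps the separator from being interpreted as a regex.
        return Arrays.asList(value.split(Pattern.quote(separator)));
    }

    public static void main(String[] args) {
        String arg = "cgroups=/sys/fs/cgroup/cpu,cpuacct/hadoop-yarn/"
            + "container_1554210318404_0057_02_01/tasks";

        // With a comma separator the path is cut at "cpu,cpuacct":
        // the first element is the directory /sys/fs/cgroup/cpu.
        System.out.println(splitCgroupPaths(arg, ","));

        // With '%' as the separator the comma stays part of the path.
        System.out.println(splitCgroupPaths(arg, "%"));
    }
}
```

With the comma separator the first element is a directory rather than a tasks file, which matches the "Is a directory" error in the report.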
[jira] [Commented] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler
[ https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836639#comment-16836639 ] Hadoop QA commented on YARN-9522: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 56s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 36s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 43s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 50m 11s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9522 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12968314/YARN-9522-002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux d93cc7b48257 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 2d31ccc | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24074/testReport/ | | Max. process+thread count | 412 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24074/console | | Powered by | Apache Yetus 0.8.0 http://yetu
[jira] [Commented] (YARN-9527) Rogue LocalizerRunner/ContainerLocalizer repeatedly downloading same file
[ https://issues.apache.org/jira/browse/YARN-9527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836638#comment-16836638 ] Jim Brennan commented on YARN-9527: --- I was able to repro the problem in branch-2.8 on a one-node-cluster by changing ApplicationImpl.AppInitDoneTransition() to immediately send a ContainerKillEvent event after first ContainerInitEvent is sent. So it's a one-time shot for the NM. I restart the nodemanager with this change, and then run a sleep job with a list of files to localize. {noformat} hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar sleep -files file1,file2,file3,file4,file5,file6,file7,file8,file9,file10,file11,file12,file13,file14,file15,file16,file17 -m 10 -r 10 -mt 1 -rt 1 {noformat} Without my fix, this causes a rogue ContainerLocalizer to get stuck in the LOCALIZED at LOCALIZED loop every time. I have verified that my fix prevents this. I have also verified that the fix without the LRUCache portion (just the findNextResource change) does not fix the problem (at least for this test case). > Rogue LocalizerRunner/ContainerLocalizer repeatedly downloading same file > - > > Key: YARN-9527 > URL: https://issues.apache.org/jira/browse/YARN-9527 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.8.5, 3.1.2 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-9527.001.patch, YARN-9527.002.patch, > YARN-9527.003.patch, YARN-9527.004.patch > > > A rogue ContainerLocalizer can get stuck in a loop continuously downloading > the same file while generating an "Invalid event: LOCALIZED at LOCALIZED" > exception on each iteration. Sometimes this continues long enough that it > fills up a disk or depletes available inodes for the filesystem. 
[jira] [Updated] (YARN-8622) NodeManager native build fails due to getgrouplist not found on macOS
[ https://issues.apache.org/jira/browse/YARN-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8622: Fix Version/s: 3.1.3 > NodeManager native build fails due to getgrouplist not found on macOS > - > > Key: YARN-8622 > URL: https://issues.apache.org/jira/browse/YARN-8622 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.2.0, 3.3.0 > Environment: Darwin 17.7.0 Darwin Kernel Version 17.7.0: Thu Jun 21 > 22:53:14 PDT 2018; root:xnu-4570.71.2~1/RELEASE_X86_64 x86_64 > Apple LLVM version 9.1.0 (clang-902.0.39.2) >Reporter: Ewan Higgs >Assignee: Siyao Meng >Priority: Major > Fix For: 3.3.0, 3.2.1, 3.1.3 > > Attachments: YARN-8622.001.patch, YARN-8622.002.patch > > > Usage of getgrouplist() is added in YARN-7221 and should affect Hadoop 3.2.0 > and later. > Compiler: > {code} > $ /Library/Developer/CommandLineTools/usr/bin/c++ --version > Apple LLVM version 9.1.0 (clang-902.0.39.2) > Target: x86_64-apple-darwin17.7.0 > Thread model: posix > InstalledDir: /Library/Developer/CommandLineTools/usr/bin > {code} > Build line: > {code} > [WARNING] /Library/Developer/CommandLineTools/usr/bin/c++ -g -O2 -Wall > -pthread -D_FILE_OFFSET_BITS=64 -Wl,-search_paths_first > -Wl,-headerpad_max_install_names > CMakeFiles/test-oom-listener.dir/main/native/oom-listener/impl/oom_listener.c.o > > CMakeFiles/test-oom-listener.dir/main/native/oom-listener/test/oom_listener_test_main.cc.o > -o test/test-oom-listener libgtest.a -lrt > {code} > Error message: > {code} > ... 
> [WARNING] > /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c:1264:12: > error: no matching function for call to 'getgrouplist' > [WARNING] int rc = getgrouplist(user, pw->pw_gid, groups, &ngroups); > [WARNING]^~~~ > [WARNING] /usr/include/unistd.h:653:6: note: candidate function not viable: > no known conversion from 'gid_t *' (aka 'unsigned int *') to 'int *' for 3rd > argument > [WARNING] int getgrouplist(const char *, int, int *, int *); > [WARNING] ^ > [WARNING] In file included from > /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/utils/test_docker_util.cc:24: > [WARNING] > /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c:1271:9: > error: no matching function for call to 'getgrouplist' > [WARNING] if (getgrouplist(user, pw->pw_gid, groups, &ngroups) == -1) { > [WARNING] ^~~~ > [WARNING] /usr/include/unistd.h:653:6: note: candidate function not viable: > no known conversion from 'gid_t *' (aka 'unsigned int *') to 'int *' for 3rd > argument > [WARNING] int getgrouplist(const char *, int, int *, int *); > [WARNING] ^ > [WARNING] 2 warnings and 2 errors generated. > [WARNING] make[2]: *** > [CMakeFiles/cetest.dir/main/native/container-executor/test/utils/test_docker_util.cc.o] > Error 1 > [WARNING] make[1]: *** [CMakeFiles/cetest.dir/all] Error 2 > [WARNING] make: *** [all] Error 2 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8622) NodeManager native build fails due to getgrouplist not found on macOS
[ https://issues.apache.org/jira/browse/YARN-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836628#comment-16836628 ] Eric Yang commented on YARN-8622: - [~smeng], pushed this change to branch-3.1. Thank you. > NodeManager native build fails due to getgrouplist not found on macOS > - > > Key: YARN-8622 > URL: https://issues.apache.org/jira/browse/YARN-8622 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.2.0, 3.3.0 > Environment: Darwin 17.7.0 Darwin Kernel Version 17.7.0: Thu Jun 21 > 22:53:14 PDT 2018; root:xnu-4570.71.2~1/RELEASE_X86_64 x86_64 > Apple LLVM version 9.1.0 (clang-902.0.39.2) >Reporter: Ewan Higgs >Assignee: Siyao Meng >Priority: Major > Fix For: 3.3.0, 3.2.1 > > Attachments: YARN-8622.001.patch, YARN-8622.002.patch > > > Usage of getgrouplist() is added in YARN-7221 and should affect Hadoop 3.2.0 > and later. > Compiler: > {code} > $ /Library/Developer/CommandLineTools/usr/bin/c++ --version > Apple LLVM version 9.1.0 (clang-902.0.39.2) > Target: x86_64-apple-darwin17.7.0 > Thread model: posix > InstalledDir: /Library/Developer/CommandLineTools/usr/bin > {code} > Build line: > {code} > [WARNING] /Library/Developer/CommandLineTools/usr/bin/c++ -g -O2 -Wall > -pthread -D_FILE_OFFSET_BITS=64 -Wl,-search_paths_first > -Wl,-headerpad_max_install_names > CMakeFiles/test-oom-listener.dir/main/native/oom-listener/impl/oom_listener.c.o > > CMakeFiles/test-oom-listener.dir/main/native/oom-listener/test/oom_listener_test_main.cc.o > -o test/test-oom-listener libgtest.a -lrt > {code} > Error message: > {code} > ... 
> [WARNING] > /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c:1264:12: > error: no matching function for call to 'getgrouplist' > [WARNING] int rc = getgrouplist(user, pw->pw_gid, groups, &ngroups); > [WARNING]^~~~ > [WARNING] /usr/include/unistd.h:653:6: note: candidate function not viable: > no known conversion from 'gid_t *' (aka 'unsigned int *') to 'int *' for 3rd > argument > [WARNING] int getgrouplist(const char *, int, int *, int *); > [WARNING] ^ > [WARNING] In file included from > /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/utils/test_docker_util.cc:24: > [WARNING] > /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c:1271:9: > error: no matching function for call to 'getgrouplist' > [WARNING] if (getgrouplist(user, pw->pw_gid, groups, &ngroups) == -1) { > [WARNING] ^~~~ > [WARNING] /usr/include/unistd.h:653:6: note: candidate function not viable: > no known conversion from 'gid_t *' (aka 'unsigned int *') to 'int *' for 3rd > argument > [WARNING] int getgrouplist(const char *, int, int *, int *); > [WARNING] ^ > [WARNING] 2 warnings and 2 errors generated. > [WARNING] make[2]: *** > [CMakeFiles/cetest.dir/main/native/container-executor/test/utils/test_docker_util.cc.o] > Error 1 > [WARNING] make[1]: *** [CMakeFiles/cetest.dir/all] Error 2 > [WARNING] make: *** [all] Error 2 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9537) Add configuration to support AM preemption
[ https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836625#comment-16836625 ]

Yufei Gu commented on YARN-9537:
--------------------------------

FairScheduler doesn't prevent you from preempting the AM container. It just tries to preempt as few AM containers as possible.

> Add configuration to support AM preemption
> ------------------------------------------
>
>                 Key: YARN-9537
>                 URL: https://issues.apache.org/jira/browse/YARN-9537
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>            Reporter: zhoukang
>            Priority: Major
>
> In our production cluster, we can tolerate AM preemption, so we could add a configuration option to support it.
[jira] [Commented] (YARN-9202) RM does not track nodes that are in the include list and never register
[ https://issues.apache.org/jira/browse/YARN-9202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836617#comment-16836617 ]

Eric Payne commented on YARN-9202:
----------------------------------

[~kshukla], thanks for your explanation. I think we should go ahead with your current approach. The current patch does not apply, however, so can you please provide an upmerged patch?

> RM does not track nodes that are in the include list and never register
> ------------------------------------------------------------------------
>
>                 Key: YARN-9202
>                 URL: https://issues.apache.org/jira/browse/YARN-9202
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.2, 3.0.3, 2.8.5
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>            Priority: Major
>         Attachments: YARN-9202.001.patch
>
>
> The RM state machine decides to put new or running nodes in the inactive state only past the point of either registration or being in the exclude list. This does not cover the case where a node is in the include list but never registers. Since all state changes are based on these NodeState transitions, having NEW nodes be listed as inactive first may help. This would change the semantics of how inactiveNodes are looked at today. Adding another state might help this case too.
[jira] [Commented] (YARN-9483) DistributedShell does not release container when failed to localize at launch
[ https://issues.apache.org/jira/browse/YARN-9483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836608#comment-16836608 ]

Prabhu Joseph commented on YARN-9483:
-------------------------------------

Thanks [~giovanni.fumarola] and [~pbacsko].

> DistributedShell does not release container when failed to localize at launch
> ------------------------------------------------------------------------------
>
>                 Key: YARN-9483
>                 URL: https://issues.apache.org/jira/browse/YARN-9483
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Prabhu Joseph
>            Assignee: Prabhu Joseph
>            Priority: Major
>             Fix For: 3.3.0
>
>         Attachments: YARN-9483-001.patch
>
>
> DistributedShell does not release the container when it fails to localize at launch. The launch threads do not increment the completed and failed container counts when localization fails, and the main thread waits for the containers to complete without failing the job.
> {code}
> yarn jar /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar -shell_command ls -shell_args / -jar /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar -localize_files /tmp/prabhu
> {code}
[jira] [Commented] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler
[ https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836605#comment-16836605 ] Prabhu Joseph commented on YARN-9522: - Thanks [~giovanni.fumarola] for the review. Attached patch-002 with above change. > AppBlock ignores full qualified class name of PseudoAuthenticationHandler > - > > Key: YARN-9522 > URL: https://issues.apache.org/jira/browse/YARN-9522 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Attachments: YARN-9522-001.patch, YARN-9522-002.patch > > > {{AuthenticationHandler}} can be either configured using fqcn or type. > {{AppBlock}} checks for only the type simple and ignores the fqcn of > {{PseudoAuthenticationHandler}} while checking whether ui is secured or not. > {code} >* @param authHandler The short-name (or fully qualified class name) of the >* authentication handler. > {code} > *AppBlock.java* > {code} > // check if UI is unsecured. > String httpAuth = > conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE); > this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple"); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler
[ https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9522: Attachment: YARN-9522-002.patch > AppBlock ignores full qualified class name of PseudoAuthenticationHandler > - > > Key: YARN-9522 > URL: https://issues.apache.org/jira/browse/YARN-9522 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Attachments: YARN-9522-001.patch, YARN-9522-002.patch > > > {{AuthenticationHandler}} can be either configured using fqcn or type. > {{AppBlock}} checks for only the type simple and ignores the fqcn of > {{PseudoAuthenticationHandler}} while checking whether ui is secured or not. > {code} >* @param authHandler The short-name (or fully qualified class name) of the >* authentication handler. > {code} > *AppBlock.java* > {code} > // check if UI is unsecured. > String httpAuth = > conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE); > this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple"); > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9483) DistributedShell does not release container when failed to localize at launch
[ https://issues.apache.org/jira/browse/YARN-9483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836602#comment-16836602 ]

Hudson commented on YARN-9483:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16532 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16532/])
YARN-9483. DistributedShell does not release container when failed to (gifuma: rev ec361263464a903348bb80f23801094b4e0570d1)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java

> DistributedShell does not release container when failed to localize at launch
> ------------------------------------------------------------------------------
>
>                 Key: YARN-9483
>                 URL: https://issues.apache.org/jira/browse/YARN-9483
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Prabhu Joseph
>            Assignee: Prabhu Joseph
>            Priority: Major
>             Fix For: 3.3.0
>
>         Attachments: YARN-9483-001.patch
>
>
> DistributedShell does not release the container when it fails to localize at launch. The launch threads do not increment the completed and failed container counts when localization fails, and the main thread waits for the containers to complete without failing the job.
> {code}
> yarn jar /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar -shell_command ls -shell_args / -jar /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar -localize_files /tmp/prabhu
> {code}
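The failure mode in the YARN-9483 description (a launch thread that fails without updating the counters the main thread waits on) can be sketched roughly as follows. This is not the committed patch: the class LaunchFailureSketch, its fields, and its methods are hypothetical names chosen for illustration, and the sketch leaves out the actual release of the container back to the RM.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of counting a failed launch; not the DistributedShell code. */
public class LaunchFailureSketch {
    final AtomicInteger numCompletedContainers = new AtomicInteger();
    final AtomicInteger numFailedContainers = new AtomicInteger();
    final int numTotalContainers;

    LaunchFailureSketch(int numTotalContainers) {
        this.numTotalContainers = numTotalContainers;
    }

    /**
     * Runs one launch step. If it throws (for example because localization
     * failed), the container is counted as both completed and failed so the
     * main loop does not wait for it forever. A successful launch is assumed
     * to be counted later by a separate completion callback (not shown).
     */
    void launch(Runnable launchStep) {
        try {
            launchStep.run();
        } catch (RuntimeException e) {
            numCompletedContainers.incrementAndGet();
            numFailedContainers.incrementAndGet();
        }
    }

    /** The main thread would poll this instead of blocking indefinitely. */
    boolean isDone() {
        return numCompletedContainers.get() >= numTotalContainers;
    }

    boolean isSuccess() {
        return isDone() && numFailedContainers.get() == 0;
    }

    public static void main(String[] args) {
        LaunchFailureSketch am = new LaunchFailureSketch(1);
        am.launch(() -> { throw new RuntimeException("localization failed"); });
        System.out.println("done=" + am.isDone() + " success=" + am.isSuccess());
    }
}
```

Without the two increments in the catch block, isDone() would never become true for a container whose launch failed, which is the hang the issue describes.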
[jira] [Commented] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler
[ https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836595#comment-16836595 ] Giovanni Matteo Fumarola commented on YARN-9522: Thanks [~Prabhu Joseph]. Can you add an additional set of parentheses to make the statements more readable? For example: ((c1) && (c2)) || c3; > AppBlock ignores full qualified class name of PseudoAuthenticationHandler > - > > Key: YARN-9522 > URL: https://issues.apache.org/jira/browse/YARN-9522 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Attachments: YARN-9522-001.patch > > > {{AuthenticationHandler}} can be configured using either the FQCN or the type. > {{AppBlock}} checks only for the type "simple" and ignores the FQCN of > {{PseudoAuthenticationHandler}} when checking whether the UI is secured. > {code} >* @param authHandler The short-name (or fully qualified class name) of the >* authentication handler. > {code} > *AppBlock.java* > {code} > // check if UI is unsecured. > String httpAuth = > conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE); > this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple"); > {code}
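A minimal sketch of the kind of check the issue and the review comment point toward, written without Hadoop dependencies so it can run on its own. It is not the contents of the attached patches: the helper class `UnsecuredUiCheck` and its constants are hypothetical, and the fully qualified name is assumed to be the hadoop-auth class `org.apache.hadoop.security.authentication.server.PseudoAuthenticationHandler`.

```java
/** Hypothetical helper for illustration; not the actual AppBlock change. */
final class UnsecuredUiCheck {
  // Short type name accepted by the authentication configuration.
  static final String SIMPLE_TYPE = "simple";
  // Assumed fully qualified class name of the pseudo handler.
  static final String PSEUDO_HANDLER_FQCN =
      "org.apache.hadoop.security.authentication.server.PseudoAuthenticationHandler";

  /** Returns true when the configured value selects the pseudo (unsecured) handler. */
  static boolean isUnsecured(String httpAuth) {
    // Explicit parentheses around each condition, as requested in the review.
    return (httpAuth != null)
        && ((httpAuth.equals(SIMPLE_TYPE)) || (httpAuth.equals(PSEUDO_HANDLER_FQCN)));
  }
}
```

Under these assumptions, both `"simple"` and the handler's class name mark the UI as unsecured, while any other value (for example `"kerberos"`) or a missing value does not.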
[jira] [Updated] (YARN-9483) DistributedShell does not release container when failed to localize at launch
[ https://issues.apache.org/jira/browse/YARN-9483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giovanni Matteo Fumarola updated YARN-9483: --- Fix Version/s: 3.3.0 > DistributedShell does not release container when failed to localize at launch > - > > Key: YARN-9483 > URL: https://issues.apache.org/jira/browse/YARN-9483 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9483-001.patch > > > DistributedShell does not release container when failed to localize at > launch. The launch threads does not increment completed & failed containers > when failed to localize. And the main thread waits for the containers to > complete without failing the job. > {code} > yarn jar > /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar > -shell_command ls -shell_args / -jar > /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar > -localize_files /tmp/prabhu > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9483) DistributedShell does not release container when failed to localize at launch
[ https://issues.apache.org/jira/browse/YARN-9483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836594#comment-16836594 ] Giovanni Matteo Fumarola commented on YARN-9483: The patch looks good. Committed to trunk. Thanks [~Prabhu Joseph] for the patch and [~pbacsko] for the initial review. > DistributedShell does not release container when failed to localize at launch > - > > Key: YARN-9483 > URL: https://issues.apache.org/jira/browse/YARN-9483 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9483-001.patch > > > DistributedShell does not release container when failed to localize at > launch. The launch threads does not increment completed & failed containers > when failed to localize. And the main thread waits for the containers to > complete without failing the job. > {code} > yarn jar > /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar > -shell_command ls -shell_args / -jar > /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar > -localize_files /tmp/prabhu > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9482) DistributedShell job with localization fails in unsecure cluster
[ https://issues.apache.org/jira/browse/YARN-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836569#comment-16836569 ] Sunil Govindan commented on YARN-9482: -- + [~rohithsharma] To me, this change looks good. If there are no other objections, let's get this in. [~rohithsharma] please take a look if you have cycles. Thank you. > DistributedShell job with localization fails in unsecure cluster > > > Key: YARN-9482 > URL: https://issues.apache.org/jira/browse/YARN-9482 > Project: Hadoop YARN > Issue Type: Bug > Components: distributed-shell >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9482-001.patch, YARN-9482-002.patch, > YARN-9482-003.patch > > > DistributedShell job with localization fails in an unsecure cluster. The client > localizes the input files to the job user's home directory, whereas the AM runs as > the yarn user and reads from its own home directory. > *Command:* > {code} > yarn jar > /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar > -shell_command ls -shell_args / -jar > /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar > -localize_files /tmp/prabhu > {code} > {code} > Exception in thread "Thread-4" java.io.UncheckedIOException: Error during > localization setup > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1495) > at > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) > at > java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580) > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.run(ApplicationMaster.java:1481) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.io.FileNotFoundException: File does not exist: > 
hdfs://yarn-ats-1:8020/user/yarn/DistributedShell/application_1554817981283_0003/prabhu > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1579) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1594) > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1487) > {code}
[jira] [Commented] (YARN-9504) [UI2] Fair scheduler queue view page is broken
[ https://issues.apache.org/jira/browse/YARN-9504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836566#comment-16836566 ] Sunil Govindan commented on YARN-9504: -- +1. Committing shortly. > [UI2] Fair scheduler queue view page is broken > -- > > Key: YARN-9504 > URL: https://issues.apache.org/jira/browse/YARN-9504 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, yarn-ui-v2 >Affects Versions: 3.2.0, 3.3.0, 3.2.1 >Reporter: Zoltan Siegl >Assignee: Zoltan Siegl >Priority: Major > Fix For: 3.3.0, 3.2.1 > > Attachments: Screenshot 2019-04-23 at 14.52.57.png, Screenshot > 2019-04-23 at 14.59.35.png, YARN-9504.001.patch, YARN-9504.002.patch > > Original Estimate: 24h > Remaining Estimate: 24h > > The UI2 queue page currently displays a white screen for the Fair Scheduler. > > In src/main/webapp/app/components/tree-selector.js:377 (getUsedCapacity) the code > refers to > queueData.get("partitionMap"), which is null for a Fair Scheduler queue.
[jira] [Commented] (YARN-9483) DistributedShell does not release container when failed to localize at launch
[ https://issues.apache.org/jira/browse/YARN-9483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836522#comment-16836522 ] Prabhu Joseph commented on YARN-9483: - [~giovanni.fumarola] Can you review this JIRA when you get time? It fixes the DS job hanging when localization fails at launch. Thanks. > DistributedShell does not release container when failed to localize at launch > - > > Key: YARN-9483 > URL: https://issues.apache.org/jira/browse/YARN-9483 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9483-001.patch > > > DistributedShell does not release the container when it fails to localize at > launch. The launch threads do not increment the completed and failed container counts > when localization fails, and the main thread keeps waiting for the containers to > complete without failing the job. > {code} > yarn jar > /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar > -shell_command ls -shell_args / -jar > /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar > -localize_files /tmp/prabhu > {code}
[jira] [Commented] (YARN-9482) DistributedShell job with localization fails in unsecure cluster
[ https://issues.apache.org/jira/browse/YARN-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836517#comment-16836517 ] Prabhu Joseph commented on YARN-9482: - [~giovanni.fumarola] Can you review this jira when you get time. This fixes DistributedShell job localization failure in unsecure cluster. Thanks. > DistributedShell job with localization fails in unsecure cluster > > > Key: YARN-9482 > URL: https://issues.apache.org/jira/browse/YARN-9482 > Project: Hadoop YARN > Issue Type: Bug > Components: distributed-shell >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9482-001.patch, YARN-9482-002.patch, > YARN-9482-003.patch > > > DistributedShell job with localization fails in unsecure cluster. The client > localizes the input files to home directory (job user) whereas the AM runs as > yarn user reads from it's home directory. > *Command:* > {code} > yarn jar > /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar > -shell_command ls -shell_args / -jar > /HADOOP/hadoop-3.2.0/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-3.2.0.jar > -localize_files /tmp/prabhu > {code} > {code} > Exception in thread "Thread-4" java.io.UncheckedIOException: Error during > localization setup > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1495) > at > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382) > at > java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580) > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.run(ApplicationMaster.java:1481) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.io.FileNotFoundException: File does not exist: > hdfs://yarn-ats-1:8020/user/yarn/DistributedShell/application_1554817981283_0003/prabhu > at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1579) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1594) > at > org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster$LaunchContainerRunnable.lambda$run$0(ApplicationMaster.java:1487) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9527) Rogue LocalizerRunner/ContainerLocalizer repeatedly downloading same file
[ https://issues.apache.org/jira/browse/YARN-9527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836478#comment-16836478 ] Hadoop QA commented on YARN-9527: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 12s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 0 new + 190 unchanged - 25 fixed = 190 total (was 215) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 44s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 51s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 67m 59s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e | | JIRA Issue | YARN-9527 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12968307/YARN-9527.004.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux ba03302ebd88 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 90add05 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24073/testReport/ | | Max. process+thread count | 412 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24073/console | | Powered by
[jira] [Commented] (YARN-9527) Rogue LocalizerRunner/ContainerLocalizer repeatedly downloading same file
[ https://issues.apache.org/jira/browse/YARN-9527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836456#comment-16836456 ] Jim Brennan commented on YARN-9527: --- Thanks for the review [~ebadger]! I've put up another patch that adds the interrupt() call back in for the running containers case. I'm not sure it's needed, but I think it's safer to keep that code path unchanged. {quote} Moving the getPathForLocalization() logic into findNextResource() makes a lot of sense so we don't have to go through the bad resources one heartbeat at a time and so we'll actually remove them from the pending list. {quote} Agreed. It is possible that this change alone will minimize the window enough to prevent the problem by itself. Instead of taking n seconds to process (and remove) n resources from the rogue container pending list, it will do it in one heartbeat, with far less opportunity for another container to start with the same resources. {quote} I'm not super wild about adding an LRU cache of 128 recent entries since it only makes the race less likely to occur instead of fixing it outright. However, this code is very complex and I can understand why you would want to make a minimally invasive change. I would like to hear other peoples' thoughts on this. {quote} The more bullet proof fix would be to change the LocalizerTracker.handle() function to look up the container state and only accept the request if the container was in the correct state. Currently the LocalizerTracker doesn't access the container directly, so it would either have to lookup the container from the container id (which I'm not certain is set for all requests) or I would have to change the LocalizerContext to include the container directly. I was concerned that this might be a performance hit (due to the synchronized containers list), since we would have to do this for every request from every container. 
I admit that the LRU approach is not 100% bullet proof, but combined with the findNextResources change, I think it is sufficient to cover the very short window in which this problem can occur, and it limits the change to a small part of the code. I am open to suggestions on how big it needs to be. {quote} It would also be good to prove that this fix actually works, and more importantly doesn't break anything else. So I think we should definitely wait for that until we put this in (if others agree with the approach) {quote} I think the unit test does show that the problem as I understand it is fixed (it fails with the old code and succeeds with the new), but I am also attempting to repro the failure manually, and will look into getting this fix deployed locally so we can test it on a larger cluster. Thanks again for your feedback [~ebadger], it would be good to get some other eyes on this as well, given the complexity of the localization code. > Rogue LocalizerRunner/ContainerLocalizer repeatedly downloading same file > - > > Key: YARN-9527 > URL: https://issues.apache.org/jira/browse/YARN-9527 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.8.5, 3.1.2 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-9527.001.patch, YARN-9527.002.patch, > YARN-9527.003.patch, YARN-9527.004.patch > > > A rogue ContainerLocalizer can get stuck in a loop continuously downloading > the same file while generating an "Invalid event: LOCALIZED at LOCALIZED" > exception on each iteration. Sometimes this continues long enough that it > fills up a disk or depletes available inodes for the filesystem. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
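The bounded LRU cache discussed in this comment can be sketched with the JDK's `LinkedHashMap`. This is only an illustration of the data structure, not the code in YARN-9527.004.patch: the class name `RecentEntries` is hypothetical, and what the patch actually stores as keys and values is not stated in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded LRU map; not the implementation from the patch. */
final class RecentEntries<K, V> extends LinkedHashMap<K, V> {
  private static final long serialVersionUID = 1L;
  private final int capacity;

  RecentEntries(int capacity) {
    // accessOrder = true: iteration order runs from least to most recently used.
    super(16, 0.75f, true);
    this.capacity = capacity;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    // Evict the least recently used entry once the bound is exceeded.
    return size() > capacity;
  }
}
```

A map built this way with a capacity of 128 would remember the 128 most recently touched entries. It is not thread-safe on its own, so concurrent callers would need external synchronization (for example `Collections.synchronizedMap`).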
[jira] [Created] (YARN-9541) TestCombinedSystemMetricsPublisher fails intermittent
Prabhu Joseph created YARN-9541: --- Summary: TestCombinedSystemMetricsPublisher fails intermittent Key: YARN-9541 URL: https://issues.apache.org/jira/browse/YARN-9541 Project: Hadoop YARN Issue Type: Bug Components: ATSv2 Affects Versions: 3.2.0 Reporter: Prabhu Joseph Assignee: Prabhu Joseph org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.testTimelineServiceEventPublishingV1V2Enabled {code} Failing for the past 1 build (Since Failed#24071 ) Took 0.19 sec. Error Message java.net.BindException: Problem binding to [0.0.0.0:10200] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException Stacktrace org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.BindException: Problem binding to [0.0.0.0:10200] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:139) at org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getServer(HadoopYarnProtoRPC.java:66) at org.apache.hadoop.yarn.ipc.YarnRPC.getServer(YarnRPC.java:55) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.serviceStart(ApplicationHistoryClientService.java:94) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceStart(ApplicationHistoryServer.java:120) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) at org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.testSetup(TestCombinedSystemMetricsPublisher.java:123) at 
org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.runTest(TestCombinedSystemMetricsPublisher.java:242) at org.apache.hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher.testTimelineServiceEventPublishingV1V2Enabled(TestCombinedSystemMetricsPublisher.java:252) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.lang.Thread.run(Thread.java:748) Caused by: java.net.BindException: Problem binding to [0.0.0.0:10200] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:833) at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:738) at org.apache.hadoop.ipc.Server.bind(Server.java:599) at 
org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1121) at org.apache.hadoop.ipc.Server.<init>(Server.java:2976) at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:1039) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:427) at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:347) at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:848) at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.createServer(RpcServerFactoryPBImpl.java:173) at org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl.getServer(RpcServerFactoryPBImpl.java:132) ... 22 more Caused by: java.net.BindException: Address already in use
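The trace above shows the test failing because the fixed port 10200 was already bound. One common way to reduce such collisions in tests, shown here as an assumption and not as the fix adopted for YARN-9541, is to ask the operating system for a free ephemeral port and configure the service with it. The helper name `FreePort` is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Hypothetical test helper; not part of the Hadoop test code. */
final class FreePort {
  /** Asks the OS for a currently unused TCP port by binding to port 0. */
  static int find() {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A small race remains with this approach: another process could take the port between the moment the probe socket closes and the moment the service binds, so it makes the failure less likely without ruling it out.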
[jira] [Created] (YARN-9540) TestRMAppTransitions fails intermittently
Prabhu Joseph created YARN-9540: --- Summary: TestRMAppTransitions fails intermittently Key: YARN-9540 URL: https://issues.apache.org/jira/browse/YARN-9540 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, test Affects Versions: 3.2.0 Reporter: Prabhu Joseph Assignee: Prabhu Joseph Failed org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppFinishedFinished[0] {code} Error Message expected:<1> but was:<0> Stacktrace java.lang.AssertionError: expected:<1> but was:<0> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:834) at org.junit.Assert.assertEquals(Assert.java:645) at org.junit.Assert.assertEquals(Assert.java:631) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.verifyAppCompletedEvent(TestRMAppTransitions.java:1307) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.verifyAppAfterFinishEvent(TestRMAppTransitions.java:1302) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testCreateAppFinished(TestRMAppTransitions.java:648) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions.testAppFinishedFinished(TestRMAppTransitions.java:1083) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runners.Suite.runChild(Suite.java:128) at org.junit.runners.Suite.runChild(Suite.java:27) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: 
yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9527) Rogue LocalizerRunner/ContainerLocalizer repeatedly downloading same file
[ https://issues.apache.org/jira/browse/YARN-9527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Brennan updated YARN-9527: -- Attachment: YARN-9527.004.patch > Rogue LocalizerRunner/ContainerLocalizer repeatedly downloading same file > - > > Key: YARN-9527 > URL: https://issues.apache.org/jira/browse/YARN-9527 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.8.5, 3.1.2 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: YARN-9527.001.patch, YARN-9527.002.patch, > YARN-9527.003.patch, YARN-9527.004.patch > > > A rogue ContainerLocalizer can get stuck in a loop continuously downloading > the same file while generating an "Invalid event: LOCALIZED at LOCALIZED" > exception on each iteration. Sometimes this continues long enough that it > fills up a disk or depletes available inodes for the filesystem. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9489) Support filtering by request-priorities and allocation-request-ids for query results of app activities
[ https://issues.apache.org/jira/browse/YARN-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836417#comment-16836417 ] Hudson commented on YARN-9489: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16530 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16530/]) YARN-9489. Support filtering by request-priorities and (wwei: rev 90add05caa6c48659f0c592ec391b30f2a76069e)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/PassThroughRESTRequestInterceptor.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/webapp/FederationInterceptorREST.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/activities/ActivitiesManager.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/webapp/RouterWebServices.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/activities/AppAllocation.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWSConsts.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesSchedulerActivities.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/BaseRouterWebServicesTest.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/ActivitiesTestUtils.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServiceProtocol.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/webapp/DefaultRequestInterceptorREST.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/webapp/MockRESTRequestInterceptor.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/RMWebServices.java
> Support filtering by request-priorities and allocation-request-ids for query > results of app activities > -- > > Key: YARN-9489 > URL: https://issues.apache.org/jira/browse/YARN-9489 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9489.001.patch, YARN-9489.002.patch > > > [Design Doc > #4.2|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.m04tqsosk94h]
[jira] [Commented] (YARN-9489) Support filtering by request-priorities and allocation-request-ids for query results of app activities
[ https://issues.apache.org/jira/browse/YARN-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836401#comment-16836401 ] Weiwei Yang commented on YARN-9489: --- Thanks for confirming it, +1. Committing shortly. > Support filtering by request-priorities and allocation-request-ids for query > results of app activities > -- > > Key: YARN-9489 > URL: https://issues.apache.org/jira/browse/YARN-9489 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-9489.001.patch, YARN-9489.002.patch > > > [Design Doc > #4.2|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.m04tqsosk94h]
[jira] [Commented] (YARN-8622) NodeManager native build fails due to getgrouplist not found on macOS
[ https://issues.apache.org/jira/browse/YARN-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836393#comment-16836393 ] Siyao Meng commented on YARN-8622: -- [~eyang] I just added target branch 3.1, which I missed before. Could you commit this to branch-3.1 as well, since YARN-7221 is also in branch-3.1? Thank you! > NodeManager native build fails due to getgrouplist not found on macOS > - > > Key: YARN-8622 > URL: https://issues.apache.org/jira/browse/YARN-8622 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.2.0, 3.3.0 > Environment: Darwin 17.7.0 Darwin Kernel Version 17.7.0: Thu Jun 21 > 22:53:14 PDT 2018; root:xnu-4570.71.2~1/RELEASE_X86_64 x86_64 > Apple LLVM version 9.1.0 (clang-902.0.39.2) >Reporter: Ewan Higgs >Assignee: Siyao Meng >Priority: Major > Fix For: 3.3.0, 3.2.1 > > Attachments: YARN-8622.001.patch, YARN-8622.002.patch
>
> Usage of getgrouplist() is added in YARN-7221 and should affect Hadoop 3.2.0 and later.
> Compiler:
> {code}
> $ /Library/Developer/CommandLineTools/usr/bin/c++ --version
> Apple LLVM version 9.1.0 (clang-902.0.39.2)
> Target: x86_64-apple-darwin17.7.0
> Thread model: posix
> InstalledDir: /Library/Developer/CommandLineTools/usr/bin
> {code}
> Build line:
> {code}
> [WARNING] /Library/Developer/CommandLineTools/usr/bin/c++ -g -O2 -Wall -pthread -D_FILE_OFFSET_BITS=64 -Wl,-search_paths_first -Wl,-headerpad_max_install_names CMakeFiles/test-oom-listener.dir/main/native/oom-listener/impl/oom_listener.c.o CMakeFiles/test-oom-listener.dir/main/native/oom-listener/test/oom_listener_test_main.cc.o -o test/test-oom-listener libgtest.a -lrt
> {code}
> Error message:
> {code}
> ...
> [WARNING] /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c:1264:12: error: no matching function for call to 'getgrouplist'
> [WARNING] int rc = getgrouplist(user, pw->pw_gid, groups, &ngroups);
> [WARNING] ^~~~
> [WARNING] /usr/include/unistd.h:653:6: note: candidate function not viable: no known conversion from 'gid_t *' (aka 'unsigned int *') to 'int *' for 3rd argument
> [WARNING] int getgrouplist(const char *, int, int *, int *);
> [WARNING] ^
> [WARNING] In file included from /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/utils/test_docker_util.cc:24:
> [WARNING] /Users/ehiggs/src/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/docker-util.c:1271:9: error: no matching function for call to 'getgrouplist'
> [WARNING] if (getgrouplist(user, pw->pw_gid, groups, &ngroups) == -1) {
> [WARNING] ^~~~
> [WARNING] /usr/include/unistd.h:653:6: note: candidate function not viable: no known conversion from 'gid_t *' (aka 'unsigned int *') to 'int *' for 3rd argument
> [WARNING] int getgrouplist(const char *, int, int *, int *);
> [WARNING] ^
> [WARNING] 2 warnings and 2 errors generated.
> [WARNING] make[2]: *** [CMakeFiles/cetest.dir/main/native/container-executor/test/utils/test_docker_util.cc.o] Error 1
> [WARNING] make[1]: *** [CMakeFiles/cetest.dir/all] Error 2
> [WARNING] make: *** [all] Error 2
> {code}
[jira] [Comment Edited] (YARN-9508) YarnConfiguration areNodeLabel enabled is costly in allocation flow
[ https://issues.apache.org/jira/browse/YARN-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836385#comment-16836385 ] Bilwa S T edited comment on YARN-9508 at 5/9/19 1:39 PM: - Hi [~bibinchundatt]. Thanks for reviewing. 1. The CheckStyle issue below can't be handled: {quote}public static void normalizeAndValidateRequest(ResourceRequest resReq,:22: More than 7 parameters (found 8). [ParameterNumber] {quote} 2. The test case failures are random and not related to my changes. was (Author: bilwast): Hi [~bibinchundatt] > YarnConfiguration areNodeLabel enabled is costly in allocation flow > --- > > Key: YARN-9508 > URL: https://issues.apache.org/jira/browse/YARN-9508 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Critical > Attachments: YARN-9508-001.patch, YARN-9508-002.patch
>
> For every allocate request locking can be avoided. Improving performance
> {noformat}
> "pool-6-thread-300" #624 prio=5 os_prio=0 tid=0x7f2f91152800 nid=0x8ec5 waiting for monitor entry [0x7f1ec6a8d000]
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2841)
> - waiting to lock <0x7f1f8107c748> (a org.apache.hadoop.yarn.conf.YarnConfiguration)
> at org.apache.hadoop.conf.Configuration.get(Configuration.java:1214)
> at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1268)
> at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1674)
> at org.apache.hadoop.yarn.conf.YarnConfiguration.areNodeLabelsEnabled(YarnConfiguration.java:3646)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:234)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndvalidateRequest(SchedulerUtils.java:274)
> at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:261)
> at org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:242)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75)
> at org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92)
> at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:427)
> - locked <0x7f24dd3f9e40> (a org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService$AllocateResponseLock)
> at org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:352)
> at org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:349)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
> at org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator.sendContainerRequest(MRAMSimulator.java:348)
> at org.apache.hadoop.yarn.sls.appmaster.AMSimulator.middleStep(AMSimulator.java:212)
> at org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:94)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
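The trace above shows an allocate thread blocked on the configuration object's monitor, reached through areNodeLabelsEnabled on every request. One possible way to avoid that per-request lock is to read the flag once and reuse the cached value. The sketch below only illustrates that idea and is not taken from the attached patches: NodeLabelsFlagCache and LockedConfig are hypothetical stand-ins, and the key name and default are used for illustration.

```java
import java.util.Properties;

// Hypothetical sketch, not the YARN-9508 patch: cache a boolean setting once
// so the hot path never touches the synchronized configuration object.
final class NodeLabelsFlagCache {

    /** Stand-in for a configuration object whose reads take a monitor lock. */
    static final class LockedConfig {
        private final Properties props = new Properties();
        private int lockedReads = 0;

        synchronized void set(String key, String value) {
            props.setProperty(key, value);
        }

        synchronized boolean getBoolean(String key, boolean defaultValue) {
            lockedReads++; // counts how often the lock was taken for a read
            String raw = props.getProperty(key);
            return raw == null ? defaultValue : Boolean.parseBoolean(raw.trim());
        }

        synchronized int lockedReads() {
            return lockedReads;
        }
    }

    // Read once at construction time; later configuration changes are not seen.
    private final boolean nodeLabelsEnabled;

    NodeLabelsFlagCache(LockedConfig conf) {
        // Key name and default are assumptions made for this illustration.
        this.nodeLabelsEnabled = conf.getBoolean("yarn.node-labels.enabled", false);
    }

    /** Lock-free read suitable for a per-request code path. */
    boolean areNodeLabelsEnabled() {
        return nodeLabelsEnabled;
    }
}
```

The trade-off is that a cached flag does not follow later configuration changes, so this only fits settings that are fixed for the lifetime of the service or that are explicitly refreshed.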
[jira] [Commented] (YARN-9508) YarnConfiguration areNodeLabel enabled is costly in allocation flow
[ https://issues.apache.org/jira/browse/YARN-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836382#comment-16836382 ] Hadoop QA commented on YARN-9508: -

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 17m 26s | trunk passed |
| +1 | compile | 0m 44s | trunk passed |
| +1 | checkstyle | 0m 29s | trunk passed |
| +1 | mvnsite | 0m 47s | trunk passed |
| +1 | shadedclient | 11m 10s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 12s | trunk passed |
| +1 | javadoc | 0m 25s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 42s | the patch passed |
| +1 | compile | 0m 38s | the patch passed |
| +1 | javac | 0m 38s | the patch passed |
| -0 | checkstyle | 0m 25s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 135 unchanged - 0 fixed = 136 total (was 135) |
| +1 | mvnsite | 0m 42s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 12s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 17s | the patch passed |
| +1 | javadoc | 0m 25s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 80m 59s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 29s | The patch does not generate ASF License warnings. |
| | | 129m 7s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |
| | hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9508 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12968291/YARN-9508-002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux eb11e171cc8e 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2595125 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/24071/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| unit | https://builds.apache.org/jo
[jira] [Commented] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler
[ https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836365#comment-16836365 ] Prabhu Joseph commented on YARN-9522: - [~giovanni.fumarola] Can you review this JIRA when you get time? Thanks. > AppBlock ignores full qualified class name of PseudoAuthenticationHandler > - > > Key: YARN-9522 > URL: https://issues.apache.org/jira/browse/YARN-9522 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Attachments: YARN-9522-001.patch > > > {{AuthenticationHandler}} can be either configured using fqcn or type. > {{AppBlock}} checks for only the type simple and ignores the fqcn of > {{PseudoAuthenticationHandler}} while checking whether ui is secured or not. > {code} >* @param authHandler The short-name (or fully qualified class name) of the >* authentication handler. > {code} > *AppBlock.java* > {code} > // check if UI is unsecured. > String httpAuth = > conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE); > this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple"); > {code}
[jira] [Commented] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler
[ https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836325#comment-16836325 ] Hadoop QA commented on YARN-9522: -

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 16m 58s | trunk passed |
| +1 | compile | 0m 29s | trunk passed |
| +1 | checkstyle | 0m 19s | trunk passed |
| +1 | mvnsite | 0m 35s | trunk passed |
| +1 | shadedclient | 10m 53s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 53s | trunk passed |
| +1 | javadoc | 0m 24s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 28s | the patch passed |
| +1 | compile | 0m 26s | the patch passed |
| +1 | javac | 0m 26s | the patch passed |
| +1 | checkstyle | 0m 13s | the patch passed |
| +1 | mvnsite | 0m 28s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 12s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 59s | the patch passed |
| +1 | javadoc | 0m 20s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 38s | hadoop-yarn-server-common in the patch passed. |
| +1 | asflicense | 0m 26s | The patch does not generate ASF License warnings. |
| | | 48m 10s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9522 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12968292/YARN-9522-001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 29d76e063364 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2595125 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24072/testReport/ |
| Max. process+thread count | 446 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/24072/console |
| Powered by | Apache Yetus 0.8.0 http://yetu
[jira] [Commented] (YARN-9489) Support filtering by request-priorities and allocation-request-ids for query results of app activities
[ https://issues.apache.org/jira/browse/YARN-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836312#comment-16836312 ] Tao Yang commented on YARN-9489: Thanks [~cheersyang] for the review. {quote} {{getAppActivities()}} was to add two new query parameters, that means we are not breaking old APIs correct? {quote} Yes, this change enhances the old APIs instead of breaking them. {quote} Another thing is please create another Jira to add some doc about the {{app-activities}} restful API in RM rest doc. {quote} Sure, I will cover the documentation of this feature and its REST APIs in YARN-9538. > Support filtering by request-priorities and allocation-request-ids for query > results of app activities > -- > > Key: YARN-9489 > URL: https://issues.apache.org/jira/browse/YARN-9489 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-9489.001.patch, YARN-9489.002.patch > > > [Design Doc > #4.2|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.m04tqsosk94h]
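The point made above, that adding two query parameters enhances the old API without breaking it, can be illustrated with optional filters: when a filter is absent, every record passes, so callers that send no new parameters see the same results as before. The sketch below is hypothetical and does not reproduce the patch; AppActivityFilter, Allocation, and the field names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not the YARN-9489 patch: optional filters that leave
// results unchanged when the caller does not supply them.
final class AppActivityFilter {

    /** Minimal stand-in for one recorded allocation attempt. */
    static final class Allocation {
        final String requestPriority;
        final String allocationRequestId;

        Allocation(String requestPriority, String allocationRequestId) {
            this.requestPriority = requestPriority;
            this.allocationRequestId = allocationRequestId;
        }
    }

    /** A null or empty filter set means "no filtering on this attribute". */
    static List<Allocation> filter(List<Allocation> all,
                                   Set<String> requestPriorities,
                                   Set<String> allocationRequestIds) {
        List<Allocation> result = new ArrayList<>();
        for (Allocation a : all) {
            boolean priorityOk = isUnset(requestPriorities)
                || requestPriorities.contains(a.requestPriority);
            boolean idOk = isUnset(allocationRequestIds)
                || allocationRequestIds.contains(a.allocationRequestId);
            if (priorityOk && idOk) {
                result.add(a);
            }
        }
        return result;
    }

    private static boolean isUnset(Set<String> filter) {
        return filter == null || filter.isEmpty();
    }
}
```

Treating an unset filter as "match all" is the design choice that keeps existing requests working: only callers that pass the new parameters get a narrowed result.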
[jira] [Created] (YARN-9539) Improve cleanup process of app activities and make some conditions configurable
Tao Yang created YARN-9539: -- Summary: Improve cleanup process of app activities and make some conditions configurable Key: YARN-9539 URL: https://issues.apache.org/jira/browse/YARN-9539 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Environment: [YARN-9050 Design doc #4.4|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.crdawajmm3a4] Reporter: Tao Yang Assignee: Tao Yang
[jira] [Created] (YARN-9538) Document scheduler/app activities and REST APIs
Tao Yang created YARN-9538: -- Summary: Document scheduler/app activities and REST APIs Key: YARN-9538 URL: https://issues.apache.org/jira/browse/YARN-9538 Project: Hadoop YARN Issue Type: Sub-task Components: documentation Reporter: Tao Yang Assignee: Tao Yang Add documentation for scheduler/app activities in CapacityScheduler.md and ResourceManagerRest.md.
[jira] [Commented] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler
[ https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836292#comment-16836292 ] Prabhu Joseph commented on YARN-9522: - When hadoop.http.authentication.type is set to simple, the YARN UI is considered unsecured and the "Kill" button is not displayed, whereas it is displayed when the type is set to org.apache.hadoop.security.authentication.server.PseudoAuthenticationHandler. I have tested the fix and it works fine. > AppBlock ignores full qualified class name of PseudoAuthenticationHandler > - > > Key: YARN-9522 > URL: https://issues.apache.org/jira/browse/YARN-9522 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Attachments: YARN-9522-001.patch > > > {{AuthenticationHandler}} can be either configured using fqcn or type. > {{AppBlock}} checks for only the type simple and ignores the fqcn of > {{PseudoAuthenticationHandler}} while checking whether ui is secured or not. > {code} >* @param authHandler The short-name (or fully qualified class name) of the >* authentication handler. > {code} > *AppBlock.java* > {code} > // check if UI is unsecured. > String httpAuth = > conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE); > this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple"); > {code}
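The mismatch described in this issue comes from comparing the configured value only against the short name. A minimal sketch of a check that accepts both the short name and the fully qualified class name follows; it is an illustration under assumptions, not the content of YARN-9522-001.patch, and HttpAuthTypeCheck is a hypothetical helper that takes the already-read configuration value as a plain string.

```java
// Hypothetical sketch, not the YARN-9522 patch: treat both the short type
// name and the handler's fully qualified class name as "unsecured".
final class HttpAuthTypeCheck {

    static final String SIMPLE_TYPE = "simple";

    // Class name as quoted in the comment above.
    static final String PSEUDO_HANDLER_FQCN =
        "org.apache.hadoop.security.authentication.server.PseudoAuthenticationHandler";

    /**
     * Returns true when the configured HTTP authentication value selects
     * pseudo (simple) authentication, by short name or by class name.
     * A null value keeps the original behaviour and yields false.
     */
    static boolean isUnsecured(String httpAuth) {
        return httpAuth != null
            && (httpAuth.equals(SIMPLE_TYPE) || httpAuth.equals(PSEUDO_HANDLER_FQCN));
    }
}
```

With a check of this shape, a cluster configured with the class name would be classified the same way as one configured with simple.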
[jira] [Updated] (YARN-9522) AppBlock ignores full qualified class name of PseudoAuthenticationHandler
[ https://issues.apache.org/jira/browse/YARN-9522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prabhu Joseph updated YARN-9522: Attachment: YARN-9522-001.patch > AppBlock ignores full qualified class name of PseudoAuthenticationHandler > - > > Key: YARN-9522 > URL: https://issues.apache.org/jira/browse/YARN-9522 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Minor > Attachments: YARN-9522-001.patch > > > {{AuthenticationHandler}} can be either configured using fqcn or type. > {{AppBlock}} checks for only the type simple and ignores the fqcn of > {{PseudoAuthenticationHandler}} while checking whether ui is secured or not. > {code} >* @param authHandler The short-name (or fully qualified class name) of the >* authentication handler. > {code} > *AppBlock.java* > {code} > // check if UI is unsecured. > String httpAuth = > conf.get(CommonConfigurationKeys.HADOOP_HTTP_AUTHENTICATION_TYPE); > this.unsecuredUI = (httpAuth != null) && httpAuth.equals("simple"); > {code}
[jira] [Updated] (YARN-9508) YarnConfiguration areNodeLabel enabled is costly in allocation flow
[ https://issues.apache.org/jira/browse/YARN-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bilwa S T updated YARN-9508:
Attachment: YARN-9508-002.patch

> YarnConfiguration areNodeLabel enabled is costly in allocation flow
>
> Key: YARN-9508
> URL: https://issues.apache.org/jira/browse/YARN-9508
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin A Chundatt
> Assignee: Bilwa S T
> Priority: Critical
> Attachments: YARN-9508-001.patch, YARN-9508-002.patch
>
> Locking on every allocate request can be avoided, which would improve performance.
> {noformat}
> "pool-6-thread-300" #624 prio=5 os_prio=0 tid=0x7f2f91152800 nid=0x8ec5 waiting for monitor entry [0x7f1ec6a8d000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>     at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2841)
>     - waiting to lock <0x7f1f8107c748> (a org.apache.hadoop.yarn.conf.YarnConfiguration)
>     at org.apache.hadoop.conf.Configuration.get(Configuration.java:1214)
>     at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:1268)
>     at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1674)
>     at org.apache.hadoop.yarn.conf.YarnConfiguration.areNodeLabelsEnabled(YarnConfiguration.java:3646)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:234)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndvalidateRequest(SchedulerUtils.java:274)
>     at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.normalizeAndValidateRequests(RMServerUtils.java:261)
>     at org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:242)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75)
>     at org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92)
>     at org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:427)
>     - locked <0x7f24dd3f9e40> (a org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService$AllocateResponseLock)
>     at org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:352)
>     at org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator$1.run(MRAMSimulator.java:349)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>     at org.apache.hadoop.yarn.sls.appmaster.MRAMSimulator.sendContainerRequest(MRAMSimulator.java:348)
>     at org.apache.hadoop.yarn.sls.appmaster.AMSimulator.middleStep(AMSimulator.java:212)
>     at org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:94)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
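The thread in the trace above is blocked in the synchronized Configuration.getProps call, reached through areNodeLabelsEnabled on every allocate. One possible way to avoid that, sketched here under assumptions, is to read the flag once at initialization and serve later calls from a field. SlowConf and CachedNodeLabelFlag are hypothetical stand-ins written for this illustration, not Hadoop classes; only the key yarn.node-labels.enabled is taken from YARN. This is not necessarily what the attached patches do.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

class NodeLabelFlagDemo {

  // Hypothetical stand-in for a configuration object whose lookups take a
  // lock on every call, similar in spirit to the blocked frame in the trace.
  static class SlowConf {
    private final Map<String, String> props = new HashMap<>();
    final AtomicInteger lockedReads = new AtomicInteger();

    synchronized void set(String key, String value) {
      props.put(key, value);
    }

    synchronized boolean getBoolean(String key, boolean defaultValue) {
      lockedReads.incrementAndGet(); // count how often the lock is taken
      String v = props.get(key);
      return v == null ? defaultValue : Boolean.parseBoolean(v.trim());
    }
  }

  // Hypothetical helper: reads the flag once, then answers from a final field.
  static class CachedNodeLabelFlag {
    static final String NODE_LABELS_ENABLED = "yarn.node-labels.enabled";
    private final boolean enabled;

    CachedNodeLabelFlag(SlowConf conf) {
      // The only locked read happens here, at construction time.
      this.enabled = conf.getBoolean(NODE_LABELS_ENABLED, false);
    }

    boolean areNodeLabelsEnabled() {
      return enabled; // no lock and no map lookup on the hot path
    }
  }

  public static void main(String[] args) {
    SlowConf conf = new SlowConf();
    conf.set(CachedNodeLabelFlag.NODE_LABELS_ENABLED, "true");
    CachedNodeLabelFlag flag = new CachedNodeLabelFlag(conf);

    // Simulate many allocate calls consulting the flag.
    for (int i = 0; i < 1000; i++) {
      flag.areNodeLabelsEnabled();
    }
    System.out.println("enabled=" + flag.areNodeLabelsEnabled()
        + ", locked reads=" + conf.lockedReads.get());
  }
}
```

The trade-off is that a cached value does not reflect later changes to the configuration object, so this approach only fits settings that are fixed after startup.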
[jira] [Comment Edited] (YARN-9537) Add configuration to support AM preemption
[ https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836256#comment-16836256 ]

Liu Shaohui edited comment on YARN-9537 at 5/9/19 10:04 AM:

support -> disable?

was (Author: liushaohui):
h1. support -> disable

> Add configuration to support AM preemption
>
> Key: YARN-9537
> URL: https://issues.apache.org/jira/browse/YARN-9537
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: fairscheduler
> Reporter: zhoukang
> Priority: Major
>
> In our production cluster, we can tolerate AM preemption, so we can add a
> configuration to support AM preemption.
[jira] [Commented] (YARN-9537) Add configuration to support AM preemption
[ https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836256#comment-16836256 ]

Liu Shaohui commented on YARN-9537:

h1. support -> disable

> Add configuration to support AM preemption
>
> Key: YARN-9537
> URL: https://issues.apache.org/jira/browse/YARN-9537
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: fairscheduler
> Reporter: zhoukang
> Priority: Major
>
> In our production cluster, we can tolerate AM preemption, so we can add a
> configuration to support AM preemption.
[jira] [Commented] (YARN-9519) TFile log aggregation file format is insensitive to the yarn.log-aggregation.TFile.remote-app-log-dir config
[ https://issues.apache.org/jira/browse/YARN-9519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836239#comment-16836239 ]

Sunil Govindan commented on YARN-9519:

@[sumasai.shivapra...@gmail.com|mailto:sumasai.shivapra...@gmail.com] Could you please take a look?

> TFile log aggregation file format is insensitive to the
> yarn.log-aggregation.TFile.remote-app-log-dir config
>
> Key: YARN-9519
> URL: https://issues.apache.org/jira/browse/YARN-9519
> Project: Hadoop YARN
> Issue Type: Bug
> Components: log-aggregation
> Affects Versions: 3.2.0
> Reporter: Adam Antal
> Assignee: Adam Antal
> Priority: Major
> Attachments: YARN-9519.001.patch, YARN-9519.002.patch,
> YARN-9519.003.patch, YARN-9519.004.patch, YARN-9519.005.patch
>
> The TFile log aggregation file format is not sensitive to the
> yarn.log-aggregation.TFile.remote-app-log-dir config.
> In {{LogAggregationTFileController$initInternal}}:
> {code:java}
> this.remoteRootLogDir = new Path(
>     conf.get(YarnConfiguration.NM_REMOTE_APP_LOG_DIR,
>         YarnConfiguration.DEFAULT_NM_REMOTE_APP_LOG_DIR));
> {code}
> So the remoteRootLogDir is only aware of the
> yarn.nodemanager.remote-app-log-dir config, while other file formats, such as
> IFile, read the format-specific config first, so that config takes priority.
> From {{LogAggregationIndexedFileController$initInternal}}:
> {code:java}
> String remoteDirStr = String.format(
>     YarnConfiguration.LOG_AGGREGATION_REMOTE_APP_LOG_DIR_FMT,
>     this.fileControllerName);
> String remoteDir = conf.get(remoteDirStr);
> if (remoteDir == null || remoteDir.isEmpty()) {
>   remoteDir = conf.get(YarnConfiguration.NM_REMOTE_APP_LOG_DIR,
>       YarnConfiguration.DEFAULT_NM_REMOTE_APP_LOG_DIR);
> }
> {code}
> (where these configs are:)
> {code:java}
> public static final String LOG_AGGREGATION_REMOTE_APP_LOG_DIR_FMT
>     = YARN_PREFIX + "log-aggregation.%s.remote-app-log-dir";
> public static final String NM_REMOTE_APP_LOG_DIR =
>     NM_PREFIX + "remote-app-log-dir";
> {code}
> I suggest TFile should try to obtain the remote dir config from
> yarn.log-aggregation.TFile.remote-app-log-dir first, and fall back to the
> yarn.nodemanager.remote-app-log-dir config only if that is not specified.
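The lookup order suggested in the description can be sketched as follows. This is a minimal illustration, not the attached patch: a plain Map stands in for the Hadoop Configuration object, resolveRemoteDir is a hypothetical helper name, and the default value "/tmp/logs" is an assumption used only for the example. The two key strings follow the constants quoted in the description.

```java
import java.util.HashMap;
import java.util.Map;

class RemoteLogDirDemo {
  // Key names as quoted in the issue description (with prefixes expanded).
  static final String FORMAT_DIR_FMT = "yarn.log-aggregation.%s.remote-app-log-dir";
  static final String NM_REMOTE_APP_LOG_DIR = "yarn.nodemanager.remote-app-log-dir";
  // Assumed default, for illustration only.
  static final String DEFAULT_NM_REMOTE_APP_LOG_DIR = "/tmp/logs";

  // Hypothetical helper: prefer the format-specific directory, and fall back
  // to the NodeManager-wide directory when it is unset or empty.
  static String resolveRemoteDir(Map<String, String> conf, String controllerName) {
    String remoteDir = conf.get(String.format(FORMAT_DIR_FMT, controllerName));
    if (remoteDir == null || remoteDir.isEmpty()) {
      remoteDir = conf.getOrDefault(NM_REMOTE_APP_LOG_DIR, DEFAULT_NM_REMOTE_APP_LOG_DIR);
    }
    return remoteDir;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(NM_REMOTE_APP_LOG_DIR, "/nm/logs");
    System.out.println(resolveRemoteDir(conf, "TFile")); // falls back to the NM-wide dir

    conf.put(String.format(FORMAT_DIR_FMT, "TFile"), "/tfile/logs");
    System.out.println(resolveRemoteDir(conf, "TFile")); // format-specific dir wins
  }
}
```

Using the same resolution for every file format would keep TFile and IFile consistent, which is the behavior the description asks for.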
[jira] [Created] (YARN-9537) Add configuration to support AM preemption
zhoukang created YARN-9537:

Summary: Add configuration to support AM preemption
Key: YARN-9537
URL: https://issues.apache.org/jira/browse/YARN-9537
Project: Hadoop YARN
Issue Type: Improvement
Components: fairscheduler
Reporter: zhoukang

In our production cluster, we can tolerate AM preemption, so we can add a configuration to support AM preemption.
[jira] [Commented] (YARN-9489) Support filtering by request-priorities and allocation-request-ids for query results of app activities
[ https://issues.apache.org/jira/browse/YARN-9489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836208#comment-16836208 ]

Weiwei Yang commented on YARN-9489:

Hi [~Tao Yang],

Most changes look good to me. One question about {{RMWebServiceProtocol}}: the change to {{#getAppActivities()}} only adds two new query parameters, which means we are not breaking the old APIs, correct? Just want to confirm.

Also, please create another Jira to document the {{app-activities}} RESTful API in the RM REST doc. It is currently missing.

Thanks

> Support filtering by request-priorities and allocation-request-ids for query
> results of app activities
>
> Key: YARN-9489
> URL: https://issues.apache.org/jira/browse/YARN-9489
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Tao Yang
> Assignee: Tao Yang
> Priority: Major
> Attachments: YARN-9489.001.patch, YARN-9489.002.patch
>
> [Design Doc
> #4.2|https://docs.google.com/document/d/1pwf-n3BCLW76bGrmNPM4T6pQ3vC4dVMcN2Ud1hq1t2M/edit#heading=h.m04tqsosk94h]