[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883896#comment-16883896 ]

Szilard Nemeth commented on YARN-9235:
--------------------------------------

Committed to trunk, 3.2 and 3.1 branches.
Thanks [~bsteinbach] for the contribution and [~adam.antal] for the review!

> If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
> ---------------------------------------------------------------------------------------------------------------------
>
> Key: YARN-9235
> URL: https://issues.apache.org/jira/browse/YARN-9235
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 3.0.0, 3.1.0
> Reporter: Antal Bálint Steinbach
> Assignee: Adam Antal
> Priority: Major
> Attachments: YARN-9235.001.patch, YARN-9235.002.patch, YARN-9235.003.patch, YARN-9235.004.patch, YARN-9235.004.patch, YARN-9235.branch-3.1.001.patch, YARN-9235.branch-3.2.001.patch
>
> If GPU plugin is enabled for the NodeManager, it is possible to run jobs with GPU.
> However, if LinuxContainerExecutor is not configured, an NPE is thrown when calling
> {code:java}
> GpuResourcePlugin.getNMResourceInfo{code}
> Also, there are no warns in the log if GPU is misconfigured like this.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
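The failure mode described in the issue above (an NPE from GpuResourcePlugin.getNMResourceInfo when no resource handler was initialized) can be illustrated with a minimal sketch. None of the classes below are the real Hadoop classes: SketchYarnException, SketchGpuResourceHandler, SketchGpuResourcePlugin, the initializeHandler method and the message text are all hypothetical stand-ins, and the actual patch may be structured differently. The sketch only shows the general idea of logging a warning and throwing a descriptive checked exception instead of dereferencing a null handler.

```java
import java.util.Collections;
import java.util.List;
import java.util.logging.Logger;

/** Hypothetical stand-in for the exception type; NOT the real Hadoop YarnException. */
class SketchYarnException extends Exception {
    SketchYarnException(String message) {
        super(message);
    }
}

/** Hypothetical stand-in for a handler that only exists when the executor is configured. */
class SketchGpuResourceHandler {
    List<Integer> getAssignedGpuMinorNumbers() {
        // Illustrative data only: pretend a single GPU with minor number 0 is assigned.
        return Collections.singletonList(0);
    }
}

/** Hypothetical stand-in for the plugin; field and method bodies are illustrative only. */
class SketchGpuResourcePlugin {
    private static final Logger LOG =
        Logger.getLogger(SketchGpuResourcePlugin.class.getName());

    // Stays null when no handler was created, e.g. because
    // LinuxContainerExecutor is not configured (assumption of this sketch).
    private SketchGpuResourceHandler gpuResourceHandler;

    /** Hypothetical initialization hook; called only on a correctly configured node. */
    void initializeHandler() {
        gpuResourceHandler = new SketchGpuResourceHandler();
    }

    List<Integer> getNMResourceInfo() throws SketchYarnException {
        if (gpuResourceHandler == null) {
            // Warn and fail with a descriptive exception instead of an NPE on dereference.
            String message = "GPU resource handler is not initialized; "
                + "check the container executor configuration";
            LOG.warning(message);
            throw new SketchYarnException(message);
        }
        return gpuResourceHandler.getAssignedGpuMinorNumbers();
    }
}
```

In this sketch, calling getNMResourceInfo() on a freshly constructed SketchGpuResourcePlugin throws SketchYarnException, while calling it after initializeHandler() returns the illustrative GPU list.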
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883897#comment-16883897 ]

Hudson commented on YARN-9235:
------------------------------

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16905 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16905/])
YARN-9235. If linux container executor is not set for a GPU cluster (snemeth: rev c416284bb7581747beef36d7899d8680fe33abbd)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/gpu/GpuResourcePlugin.java
* (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/gpu/TestGpuResourcePlugin.java
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883877#comment-16883877 ]

Szilard Nemeth commented on YARN-9235:
--------------------------------------

Latest patch looks good, giving +1 and committing soon!
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883599#comment-16883599 ]

Hadoop QA commented on YARN-9235:
---------------------------------

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 29s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 19s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 14s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 53s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 74m 52s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-9235 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12974478/YARN-9235.004.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 53ed7370f379 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 738fab3 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/24384/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/24384/testReport/ |
| Max. process+thread count | 307 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U:
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883573#comment-16883573 ]

Szilard Nemeth commented on YARN-9235:
--------------------------------------

Re-uploading latest patch in order to have a fresh Jenkins result.
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883284#comment-16883284 ]

Szilard Nemeth commented on YARN-9235:
--------------------------------------

Hi [~bsteinbach]!
If you don't mind, [~adam.antal] could take this jira over as this is an important bug.
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16812350#comment-16812350 ]

Antal Bálint Steinbach commented on YARN-9235:
----------------------------------------------

Hi [~jojochuang], as we discussed when you were in Budapest, could you please review this simple patch? Thanks.
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803934#comment-16803934 ]

Hadoop QA commented on YARN-9235:
---------------------------------

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 32s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 20s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 13s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 47s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m 24s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9235 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12964024/YARN-9235.004.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 92a3073ea089 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 15d38b1 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/23825/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/23825/testReport/ |
| Max. process+thread count | 446 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U:
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803896#comment-16803896 ]

Antal Bálint Steinbach commented on YARN-9235:
----------------------------------------------

[~snemeth], it already has a +1, thanks.
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803882#comment-16803882 ]

Antal Bálint Steinbach commented on YARN-9235:
----------------------------------------------

Hi,

LOGGER is renamed to LOG.

I agree that a test expecting such a general exception should be more precise, but checking the error message is not the way to do that, for several reasons (parameters in the text, i18n, etc.). I do not want to change the exception itself, as this was a small refactor and NPE fix. So I think this small tradeoff is acceptable in the test.

[~sunilg], [~tangzhankun] can you please push?
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803881#comment-16803881 ]

Szilard Nemeth commented on YARN-9235:
--------------------------------------

Hi [~bsteinbach]!
Do you need help with the review?
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16792736#comment-16792736 ]

Hadoop QA commented on YARN-9235:
---------------------------------

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 29s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 23s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 12s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 21s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m 22s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9235 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12962482/YARN-9235.003.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux e700776a5344 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 983b78a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/23707/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/23707/testReport/ |
| Max. process+thread count | 411 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U:
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16787844#comment-16787844 ]

Adam Antal commented on YARN-9235:
----------------------------------

It seems that the depending patches all got resolved, so this one can go in.

Could you please check the following items:
* Could you modify the log object name from {{LOGGER}} to {{LOG}}, as in this module most of those log objects are named that (see YARN-7047 to see that convention).
* Using {{(expected = YarnException.class)}} in {{testResourceHandlerNotInitialized}} is a good idea, but I'd rather be more precise on that, as YarnException is too general and it is possible this test still passes if any other YarnException is thrown - which is not the expected behaviour. Consider checking whether the cause of the exception contains/equals the new error message you provided in this patch.
* To take a step further you can make that error message a static package-private class variable with @VisibleForTesting annotation, and you can reference that from the test.
* Also probably a rebase is needed as those tests modified those files.
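The second and third review items above (matching on a specific error message that is exposed as a package-private constant) can be sketched roughly as follows. This is only an illustration under stated assumptions: DemoYarnException, DemoGpuPlugin, DemoGpuPluginCheck, the constant name and the message text are hypothetical and are not taken from the patch. The sketch uses a plain Java check instead of JUnit so that it stays self-contained, and a comment marks the place where the @VisibleForTesting annotation would go in the real code base.

```java
/** Hypothetical stand-in for the exception type; NOT the real Hadoop YarnException. */
class DemoYarnException extends Exception {
    DemoYarnException(String message) {
        super(message);
    }
}

/** Hypothetical plugin stand-in whose resource handler was never initialized. */
class DemoGpuPlugin {
    // Package-private so that a test in the same package can reference it.
    // In the real code base this is where @VisibleForTesting would be placed.
    static final String HANDLER_NOT_INITIALIZED_MSG =
        "GPU resource handler is not initialized";

    String getNMResourceInfo() throws DemoYarnException {
        // Always fails in this sketch, mimicking the misconfigured case.
        throw new DemoYarnException(HANDLER_NOT_INITIALIZED_MSG);
    }
}

/** Plain-Java check standing in for a JUnit test method. */
class DemoGpuPluginCheck {
    static boolean failsWithExpectedMessage(DemoGpuPlugin plugin) {
        try {
            plugin.getNMResourceInfo();
            return false; // no exception was thrown at all
        } catch (DemoYarnException e) {
            // Comparing against the shared constant rejects unrelated
            // exceptions of the same type that carry a different message.
            return DemoGpuPlugin.HANDLER_NOT_INITIALIZED_MSG.equals(e.getMessage());
        }
    }
}
```

Referencing the constant from the check keeps the production message and the expectation in one place, so rewording the message does not silently break the test.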
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16779061#comment-16779061 ]

Zoltan Siegl commented on YARN-9235:
------------------------------------

LGTM +1 (non-binding)
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777812#comment-16777812 ] Hadoop QA commented on YARN-9235: -
| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 20m 30s | trunk passed |
| +1 | compile | 0m 59s | trunk passed |
| +1 | checkstyle | 0m 25s | trunk passed |
| +1 | mvnsite | 0m 39s | trunk passed |
| +1 | shadedclient | 10m 46s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 0m 53s | trunk passed |
| +1 | javadoc | 0m 25s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 33s | the patch passed |
| +1 | compile | 0m 55s | the patch passed |
| +1 | javac | 0m 55s | the patch passed |
| -0 | checkstyle | 0m 19s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2) |
| +1 | mvnsite | 0m 34s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 30s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 3s | the patch passed |
| +1 | javadoc | 0m 23s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 21m 20s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 28s | The patch does not generate ASF License warnings. |
| | | 71m 59s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9235 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12960161/YARN-9235.003.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 7a515c702021 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 59ba355 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/23541/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/23541/testReport/ |
| Max. process+thread count | 412 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U:
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777105#comment-16777105 ] Hadoop QA commented on YARN-9235: -
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 18m 26s | trunk passed |
| +1 | compile | 1m 12s | trunk passed |
| +1 | checkstyle | 0m 31s | trunk passed |
| +1 | mvnsite | 0m 43s | trunk passed |
| +1 | shadedclient | 13m 24s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 19s | trunk passed |
| +1 | javadoc | 0m 36s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 41s | the patch passed |
| +1 | compile | 1m 7s | the patch passed |
| +1 | javac | 1m 7s | the patch passed |
| -0 | checkstyle | 0m 26s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2) |
| +1 | mvnsite | 0m 41s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 7 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 13m 29s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 7s | the patch passed |
| +1 | javadoc | 0m 24s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 20m 42s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 29s | The patch does not generate ASF License warnings. |
| | | 75m 48s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9235 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12960049/YARN-9235.002.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 25263e3622e2 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6cec906 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/23532/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/23532/artifact/out/whitespace-eol.txt |
| Test Results |
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777016#comment-16777016 ] Antal Bálint Steinbach commented on YARN-9235: -- Hi [~sunilg] , Tests are added as requested. > If linux container executor is not set for a GPU cluster > GpuResourceHandlerImpl is not initialized and NPE is thrown > > > Key: YARN-9235 > URL: https://issues.apache.org/jira/browse/YARN-9235 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0, 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Major > Attachments: YARN-9235.001.patch, YARN-9235.002.patch > > > If GPU plugin is enabled for the NodeManager, it is possible to run jobs with > GPU. > However, if LinuxContainerExecutor is not configured, an NPE is thrown when > calling > {code:java} > GpuResourcePlugin.getNMResourceInfo{code} > Also, there are no warns in the log if GPU is misconfigured like this.
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16776011#comment-16776011 ] Szilard Nemeth commented on YARN-9235: -- [~sunilg]: Just an FYI, this depends on YARN-9121 and not YARN-9118. YARN-9118 was referenced by mistake. > If linux container executor is not set for a GPU cluster > GpuResourceHandlerImpl is not initialized and NPE is thrown > > > Key: YARN-9235 > URL: https://issues.apache.org/jira/browse/YARN-9235 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0, 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Major > Attachments: YARN-9235.001.patch > > > If GPU plugin is enabled for the NodeManager, it is possible to run jobs with > GPU. > However, if LinuxContainerExecutor is not configured, an NPE is thrown when > calling > {code:java} > GpuResourcePlugin.getNMResourceInfo{code} > Also, there are no warns in the log if GPU is misconfigured like this.
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775202#comment-16775202 ] Sunil Govindan commented on YARN-9235: -- Thanks [~bsteinbach]. Makes sense. Let's visit and review (YARN-9118, YARN-9213) and come back here. > If linux container executor is not set for a GPU cluster > GpuResourceHandlerImpl is not initialized and NPE is thrown > > > Key: YARN-9235 > URL: https://issues.apache.org/jira/browse/YARN-9235 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0, 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Major > Attachments: YARN-9235.001.patch > > > If GPU plugin is enabled for the NodeManager, it is possible to run jobs with > GPU. > However, if LinuxContainerExecutor is not configured, an NPE is thrown when > calling > {code:java} > GpuResourcePlugin.getNMResourceInfo{code} > Also, there are no warns in the log if GPU is misconfigured like this.
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769357#comment-16769357 ] Antal Bálint Steinbach commented on YARN-9235: -- Hi [~pbacsko], [~sunilg], Yes, that would be great, but unfortunately it is not so easy. This class is not written to be testable: _GpuDiscoverer.getInstance().getGpuDeviceInformation()_ will throw an exception before we reach the code we would like to test. [~snemeth] and I have some patches available that address this issue. I would not make the same change (adding GpuDiscoverer as a dependency of the class) in a third patch for this. YARN-9217, for example, has a test that exercises this method. Unfortunately, this issue is blocked by [~snemeth]'s other pending commits (YARN-9118, YARN-9213), because they conflict badly. I would recommend either submitting those first, after which I merge my issues, or submitting this without a test, after which I resolve the problem in the other mentioned issues. > If linux container executor is not set for a GPU cluster > GpuResourceHandlerImpl is not initialized and NPE is thrown > > > Key: YARN-9235 > URL: https://issues.apache.org/jira/browse/YARN-9235 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0, 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Major > Attachments: YARN-9235.001.patch > > > If GPU plugin is enabled for the NodeManager, it is possible to run jobs with > GPU. > However, if LinuxContainerExecutor is not configured, an NPE is thrown when > calling > {code:java} > GpuResourcePlugin.getNMResourceInfo{code} > Also, there are no warns in the log if GPU is misconfigured like this.
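The testability problem described in the comment above (a static singleton call that throws before the code under test is reached) is commonly addressed by passing the discoverer in as a dependency. The following is a minimal sketch of that idea only. `DeviceDiscoverer`, `PluginSketch`, and the error text are hypothetical names invented for illustration, and nothing here is taken from the YARN-9118, YARN-9213, or YARN-9217 patches.

```java
public class InjectedDiscovererSketch {

  /** Hypothetical seam standing in for the GpuDiscoverer singleton. */
  interface DeviceDiscoverer {
    String getGpuDeviceInformation() throws Exception;
  }

  /** Hypothetical, heavily reduced plugin; not the real GpuResourcePlugin. */
  static class PluginSketch {
    private final DeviceDiscoverer discoverer;
    // null models the case where the resource handler was never initialized.
    private final Object resourceHandler;

    PluginSketch(DeviceDiscoverer discoverer, Object resourceHandler) {
      this.discoverer = discoverer;
      this.resourceHandler = resourceHandler;
    }

    String getNMResourceInfo() throws Exception {
      // Discovery runs first, mirroring the ordering described in the comment.
      String info = discoverer.getGpuDeviceInformation();
      if (resourceHandler == null) {
        throw new IllegalStateException("resource handler is not initialized");
      }
      return info;
    }
  }

  /** Turns the result of the call into a string so it is easy to assert on. */
  static String outcome(PluginSketch plugin) {
    try {
      return "ok: " + plugin.getNMResourceInfo();
    } catch (Exception e) {
      return "failed: " + e.getMessage();
    }
  }

  public static void main(String[] args) {
    // A stub discoverer returns canned data, so the handler check is reachable
    // without any GPU hardware or discovery binary on the test machine.
    DeviceDiscoverer stub = () -> "stub-gpu";
    System.out.println(outcome(new PluginSketch(stub, null)));
    System.out.println(outcome(new PluginSketch(stub, new Object())));
  }
}
```

With the discoverer injected, a unit test can reach the handler check directly, which is what a `getInstance()` call made inside the method prevents.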
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16768463#comment-16768463 ] Hadoop QA commented on YARN-9235: -
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 25s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 19m 53s | trunk passed |
| +1 | compile | 1m 6s | trunk passed |
| +1 | checkstyle | 0m 27s | trunk passed |
| +1 | mvnsite | 0m 42s | trunk passed |
| +1 | shadedclient | 12m 24s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 8s | trunk passed |
| +1 | javadoc | 0m 27s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 40s | the patch passed |
| +1 | compile | 1m 3s | the patch passed |
| +1 | javac | 1m 3s | the patch passed |
| +1 | checkstyle | 0m 25s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 0 new + 2 unchanged - 2 fixed = 2 total (was 4) |
| +1 | mvnsite | 0m 41s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 52s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 12s | the patch passed |
| +1 | javadoc | 0m 25s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 20m 47s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 28s | The patch does not generate ASF License warnings. |
| | | 75m 1s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9235 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12958728/YARN-9235.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 2ff208401cac 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 7a57974 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/23407/testReport/ |
| Max. process+thread count | 306 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U:
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16768438#comment-16768438 ] Sunil Govindan commented on YARN-9235: -- Yes, I agree with [~pbacsko]. Please add a test case here. Other than that, the approach seems fine. > If linux container executor is not set for a GPU cluster > GpuResourceHandlerImpl is not initialized and NPE is thrown > > > Key: YARN-9235 > URL: https://issues.apache.org/jira/browse/YARN-9235 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0, 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Major > Attachments: YARN-9235.001.patch > > > If GPU plugin is enabled for the NodeManager, it is possible to run jobs with > GPU. > However, if LinuxContainerExecutor is not configured, an NPE is thrown when > calling > {code:java} > GpuResourcePlugin.getNMResourceInfo{code} > Also, there are no warns in the log if GPU is misconfigured like this.
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16768325#comment-16768325 ] Antal Bálint Steinbach commented on YARN-9235: -- Hi [~sunilg] , I uploaded a very simple patch. > If linux container executor is not set for a GPU cluster > GpuResourceHandlerImpl is not initialized and NPE is thrown > > > Key: YARN-9235 > URL: https://issues.apache.org/jira/browse/YARN-9235 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0, 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Major > Attachments: YARN-9235.001.patch > > > If GPU plugin is enabled for the NodeManager, it is possible to run jobs with > GPU. > However, if LinuxContainerExecutor is not configured, an NPE is thrown when > calling > {code:java} > GpuResourcePlugin.getNMResourceInfo{code} > Also, there are no warns in the log if GPU is misconfigured like this.
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16768331#comment-16768331 ] Peter Bacsko commented on YARN-9235: [~bsteinbach] can you add a simple unit test for this scenario? > If linux container executor is not set for a GPU cluster > GpuResourceHandlerImpl is not initialized and NPE is thrown > > > Key: YARN-9235 > URL: https://issues.apache.org/jira/browse/YARN-9235 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0, 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Major > Attachments: YARN-9235.001.patch > > > If GPU plugin is enabled for the NodeManager, it is possible to run jobs with > GPU. > However, if LinuxContainerExecutor is not configured, an NPE is thrown when > calling > {code:java} > GpuResourcePlugin.getNMResourceInfo{code} > Also, there are no warns in the log if GPU is misconfigured like this.
[jira] [Commented] (YARN-9235) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
[ https://issues.apache.org/jira/browse/YARN-9235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16763722#comment-16763722 ] Sunil Govindan commented on YARN-9235: -- It's a good point. Thanks [~bsteinbach]. Could we add a validation to block this? > If linux container executor is not set for a GPU cluster > GpuResourceHandlerImpl is not initialized and NPE is thrown > > > Key: YARN-9235 > URL: https://issues.apache.org/jira/browse/YARN-9235 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 3.0.0, 3.1.0 >Reporter: Antal Bálint Steinbach >Assignee: Antal Bálint Steinbach >Priority: Major > > If GPU plugin is enabled for the NodeManager, it is possible to run jobs with > GPU. > However, if LinuxContainerExecutor is not configured, an NPE is thrown when > calling > {code:java} > GpuResourcePlugin.getNMResourceInfo{code} > Also, there are no warns in the log if GPU is misconfigured like this.
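One way to picture the validation asked about above is a check that flags the misconfiguration (GPU plugin enabled, LinuxContainerExecutor not configured) before any GPU call is made. The sketch below is an assumption-laden illustration, not the approach taken in the patch: it uses a plain `Map` instead of Hadoop's configuration classes, `validate` is a hypothetical helper, and the property names and values are written as I understand them from the YARN documentation, so they should be verified before being relied on.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class GpuConfigValidationSketch {

  // Property names and values as understood from the YARN documentation;
  // treat them as assumptions in this sketch.
  static final String RESOURCE_PLUGINS_KEY = "yarn.nodemanager.resource-plugins";
  static final String CONTAINER_EXECUTOR_KEY = "yarn.nodemanager.container-executor.class";
  static final String GPU_PLUGIN = "yarn.io/gpu";
  static final String LINUX_CONTAINER_EXECUTOR =
      "org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor";

  /**
   * Hypothetical check: returns a warning message if the GPU plugin is enabled
   * while the container executor is not LinuxContainerExecutor, else empty.
   */
  static Optional<String> validate(Map<String, String> conf) {
    String plugins = conf.getOrDefault(RESOURCE_PLUGINS_KEY, "");
    boolean gpuEnabled = Arrays.stream(plugins.split(","))
        .map(String::trim)
        .anyMatch(GPU_PLUGIN::equals);
    String executor = conf.getOrDefault(CONTAINER_EXECUTOR_KEY, "");
    if (gpuEnabled && !LINUX_CONTAINER_EXECUTOR.equals(executor)) {
      return Optional.of("GPU plugin is enabled but " + CONTAINER_EXECUTOR_KEY
          + " is not set to " + LINUX_CONTAINER_EXECUTOR);
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(RESOURCE_PLUGINS_KEY, GPU_PLUGIN);
    // GPU plugin enabled, executor not configured: a warning is produced.
    System.out.println(validate(conf).orElse("configuration looks consistent"));

    conf.put(CONTAINER_EXECUTOR_KEY, LINUX_CONTAINER_EXECUTOR);
    // Both settings present: no warning.
    System.out.println(validate(conf).orElse("configuration looks consistent"));
  }
}
```

A check of this kind would also cover the second half of the report, since the returned message could be written to the log as a warning.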