Hi Darin,

Thanks a lot for this. But what about the other case below, where cgroups
is disabled?


Björn

On 18.03.2016 at 00:25, Darin Johnson wrote:
> Hey Bjorn,
> 
> I think I figured out the issue.  Some of the values for cgroups are still
> hardcoded in Myriad.  I'll add a JIRA ticket; hopefully we can get an update
> for 0.2.0.  I'll also respond to this thread after a pull request is
> submitted in case you'd like to test it.
> 
> Darin
> Hi all,
> 
> I have trouble starting the NM on the slave nodes. Apparently, it either
> does not find its configuration or something is wrong with the configuration.
> 
> With cgroups enabled, the NM does not start. The logs contain the error
> below, which indicates that something is wrong in the configuration. However,
> yarn.nodemanager.linux-container-executor.group is set (to "yarn"). The
> value used to be "${yarn.nodemanager.linux-container-executor.group}", as
> indicated by the installation documentation, but I'm uncertain
> whether this recursion is the correct approach.
> 
> 
> ==================================================
> 16/03/14 09:32:45 FATAL nodemanager.NodeManager: Error starting NodeManager
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
>         at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:213)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:474)
>         at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:521)
> Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
>         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:193)
>         at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:211)
>         ... 3 more
> Caused by: ExitCodeException exitCode=24: Can't get configured value for yarn.nodemanager.linux-container-executor.group.
> 
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:543)
>         at org.apache.hadoop.util.Shell.run(Shell.java:460)
>         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:720)
>         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:187)
>         ... 4 more
> ==================================================
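
One thing that might be worth checking for the error above: as far as I
understand, the message with exit code 24 is printed by the native
container-executor binary, which reads its own container-executor.cfg rather
than yarn-site.xml. Below is a minimal Python sketch that checks whether that
file defines the group. The file location and the helper names (parse_cfg,
check_group) are my assumptions for illustration, not anything taken from
Myriad or Hadoop.

```python
import re
from pathlib import Path

GROUP_KEY = "yarn.nodemanager.linux-container-executor.group"


def parse_cfg(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_group(text, key=GROUP_KEY):
    """Return a short diagnosis for the group setting in the given cfg text."""
    value = parse_cfg(text).get(key)
    if not value:
        return "missing"
    # A literal ${...} would mean the placeholder was never substituted.
    if re.fullmatch(r"\$\{.*\}", value):
        return "unresolved placeholder"
    return "ok: " + value


if __name__ == "__main__":
    # Assumed location; adjust to wherever the file lives on the slave node.
    cfg = Path("/etc/hadoop/conf/container-executor.cfg")
    if cfg.exists():
        print(check_group(cfg.read_text()))
    else:
        print("not found:", cfg)
```

If this reports "missing" or "unresolved placeholder" on a slave node, the
value set in yarn-site.xml may simply not be reaching the file that the
binary reads.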
> 
> 
> I have given it another try with cgroups disabled (in
> myriad-config-default.yml). I seem to get a little further, but am still
> stuck at running YARN jobs:
> 
> ==================================================
> 16/03/14 10:56:34 INFO container.Container: Container container_1457949199710_0001_01_000001 transitioned from LOCALIZED to RUNNING
> 16/03/14 10:56:34 INFO nodemanager.DefaultContainerExecutor: launchContainer: [bash, /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/bjoernh/appcache/application_1457949199710_0001/container_1457949199710_0001_01_000001/default_container_executor.sh]
> 16/03/14 10:56:34 WARN nodemanager.DefaultContainerExecutor: Exit code from container container_1457949199710_0001_01_000001 is : 1
> 16/03/14 10:56:34 WARN nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1457949199710_0001_01_000001 and exit code: 1
> ExitCodeException exitCode=1:
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:543)
>         at org.apache.hadoop.util.Shell.run(Shell.java:460)
>         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:720)
>         at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> 16/03/14 10:56:34 INFO nodemanager.ContainerExecutor: Exception from container-launch.
> 16/03/14 10:56:34 INFO nodemanager.ContainerExecutor: Container id: container_1457949199710_0001_01_000001
> 16/03/14 10:56:34 INFO nodemanager.ContainerExecutor: Exit code: 1
> ==================================================
> 
> Unfortunately, the directory
> /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/bjoernh/appcache/
> is empty; the log indicates that it is deleted after the failed
> attempt.
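
To keep the container directory around long enough to inspect the launch
script and its output, YARN has the property
yarn.nodemanager.delete.debug-delay-sec, which delays the cleanup by the
given number of seconds. The following is only a sketch of how the property
could be added to a yarn-site.xml style document in Python; the helper name
set_property and the value of 600 seconds are assumptions chosen for
illustration, and how the setting reaches the NM under Myriad is not covered.

```python
import xml.etree.ElementTree as ET

DELAY_KEY = "yarn.nodemanager.delete.debug-delay-sec"


def set_property(root, name, value):
    """Set or add a <property> with the given name under a <configuration> root."""
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            val = prop.find("value")
            if val is None:
                val = ET.SubElement(prop, "value")
            val.text = value
            return
    # Property not present yet: append a new name/value pair.
    prop = ET.SubElement(root, "property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value


if __name__ == "__main__":
    # Stand-in for the parsed contents of a real yarn-site.xml.
    root = ET.fromstring("<configuration></configuration>")
    set_property(root, DELAY_KEY, "600")  # 600 s is an arbitrary example value
    print(ET.tostring(root, encoding="unicode"))
```

With the delay in place, the files under the appcache directory should
survive the failed attempt and may show why the script exits with code 1.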
> 
> Again, any hint would be useful, also regarding the activation of cgroups.
> 
> 
> Best regards,
> Björn
> 
> --
> Dipl.-Inform. Björn Hagemeier
> Federated Systems and Data
> Juelich Supercomputing Centre
> Institute for Advanced Simulation
> 
> Phone: +49 2461 61 1584
> Fax  : +49 2461 61 6656
> Email: b.hageme...@fz-juelich.de
> Skype: bhagemeier
> WWW  : http://www.fz-juelich.de/jsc
> 
> JSC is the coordinator of the
> John von Neumann Institute for Computing
> and member of the
> Gauss Centre for Supercomputing
> 
> -------------------------------------------------------------------------------------
> -------------------------------------------------------------------------------------
> Forschungszentrum Juelich GmbH
> 52425 Juelich
> Sitz der Gesellschaft: Juelich
> Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
> Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher
> Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
> Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
> Prof. Dr. Sebastian M. Schmidt
> -------------------------------------------------------------------------------------
> -------------------------------------------------------------------------------------
> 


