Bjorn, I don't know if you're still experimenting with Myriad, but I
believe I've got a fix for your issue.  I'm going to try to get it into our
next release, so any feedback would be great.  I verified it on a couple
of small systems.

https://github.com/apache/incubator-myriad/pull/69

On Wed, Mar 23, 2016 at 8:17 AM, Darin Johnson <dbjohnson1...@gmail.com>
wrote:

> Hey Bjorn, sorry for the delay. Looking at the difference between the
> exceptions, and going by my own experience, I believe you left some
> cgroup configs in the node manager's yarn-site.xml.
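
A quick way to check for leftover cgroup settings like the ones suspected
above is to scan the node manager's yarn-site.xml for LinuxContainerExecutor
and cgroups properties. The Python sketch below is only an illustration: the
helper name find_cgroup_properties, the marker substrings, and the sample XML
are assumptions, not part of Myriad or Hadoop.

```python
import xml.etree.ElementTree as ET

# Substrings that mark a property as cgroup/LCE related.
# This list is an assumption for illustration; adjust it to your setup.
CGROUP_MARKERS = ("linux-container-executor", "cgroups")
EXECUTOR_KEY = "yarn.nodemanager.container-executor.class"


def find_cgroup_properties(xml_text):
    """Return {name: value} for properties that look cgroup-related."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name == EXECUTOR_KEY and "LinuxContainerExecutor" in value:
            # The executor class itself still points at the Linux executor.
            found[name] = value
        elif any(marker in name for marker in CGROUP_MARKERS):
            found[name] = value
    return found


# Hypothetical sample resembling a yarn-site.xml with leftovers in it.
SAMPLE = """<configuration>
  <property>
    <name>yarn.nodemanager.container-executor.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
  </property>
  <property>
    <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
  </property>
  <property>
    <name>yarn.nodemanager.linux-container-executor.group</name>
    <value>yarn</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for name, value in sorted(find_cgroup_properties(SAMPLE).items()):
        print(f"{name} = {value}")
```

To run it against a real file, pass the file's contents instead of SAMPLE.
Any property it reports is worth reviewing when cgroups are meant to be
disabled.
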
> On Mar 18, 2016 2:58 AM, "Björn Hagemeier" <b.hageme...@fz-juelich.de>
> wrote:
>
>> Hi Darin,
>>
>> thanks a lot for this. But what about the other case below, when cgroups
>> is disabled?
>>
>>
>> Björn
>>
>> Am 18.03.2016 um 00:25 schrieb Darin Johnson:
>> > Hey Bjorn,
>> >
>> > I think I figured out the issue.  Some of the values for cgroups are
>> > still hardcoded in Myriad.  I'll add a JIRA ticket; hopefully we can
>> > get an update for 0.2.0.  I'll also respond to this thread after a
>> > pull request is submitted in case you'd like to test it.
>> >
>> > Darin
>> > Hi all,
>> >
>> > I have trouble starting the NM on the slave nodes. Apparently, it does
>> > not find its configuration, or something is wrong with the configuration.
>> >
>> > With cgroups enabled, the NM does not start. The logs contain the
>> > following, indicating that there is something wrong in the configuration.
>> > However, yarn.nodemanager.linux-container-executor.group is set (to
>> > "yarn"). The value used to be
>> > "${yarn.nodemanager.linux-container-executor.group}", as indicated by the
>> > installation documentation; however, I'm uncertain whether this recursion
>> > is the correct approach.
>> >
>> >
>> > ==================================================
>> > 16/03/14 09:32:45 FATAL nodemanager.NodeManager: Error starting NodeManager
>> > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to initialize container executor
>> >         at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:213)
>> >         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>> >         at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:474)
>> >         at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:521)
>> > Caused by: java.io.IOException: Linux container executor not configured properly (error=24)
>> >         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:193)
>> >         at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:211)
>> >         ... 3 more
>> > Caused by: ExitCodeException exitCode=24: Can't get configured value for yarn.nodemanager.linux-container-executor.group.
>> >
>> >         at org.apache.hadoop.util.Shell.runCommand(Shell.java:543)
>> >         at org.apache.hadoop.util.Shell.run(Shell.java:460)
>> >         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:720)
>> >         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:187)
>> >         ... 4 more
>> > ==================================================
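
On the question above of whether the "${...}" self-reference can work, here
is a minimal Python sketch of "${name}" substitution. It is a simplified
model, not Hadoop's actual code: it assumes a value can be supplied either
from outside the file (for example, properties passed to the JVM at launch)
or from another entry in the file. The names expand, config, and external
are hypothetical.

```python
import re

VAR = re.compile(r"\$\{([^}]+)\}")


def expand(name, config, external, max_depth=20):
    """Expand ${var} references in config[name].

    The lookup order (external values first, then other entries in the
    file) is an assumption of this sketch. References that cannot be
    resolved are left in place unchanged.
    """
    value = config[name]

    def lookup(match):
        var = match.group(1)
        if var in external:
            return external[var]
        if var in config and var != name:
            return config[var]
        return match.group(0)  # unresolved: keep the literal ${var}

    for _ in range(max_depth):
        expanded = VAR.sub(lookup, value)
        if expanded == value:
            break
        value = expanded
    return value


if __name__ == "__main__":
    key = "yarn.nodemanager.linux-container-executor.group"
    config = {key: "${" + key + "}"}
    # Nothing outside the file defines the name: the reference stays literal.
    print(expand(key, config, external={}))
    # An outside source defines the name: the reference resolves.
    print(expand(key, config, external={key: "yarn"}))
```

In this model, a self-referencing entry only resolves when something outside
the file supplies the name; otherwise the literal "${...}" string is what the
consumer sees. Whether that holds for a given setup depends on how the
node manager is launched.
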
>> >
>> >
>> > I have given it another try with cgroups disabled (in
>> > myriad-config-default.yml). I seem to get a little further, but I am
>> > still stuck at running YARN jobs:
>> >
>> > ==================================================
>> > 16/03/14 10:56:34 INFO container.Container: Container container_1457949199710_0001_01_000001 transitioned from LOCALIZED to RUNNING
>> > 16/03/14 10:56:34 INFO nodemanager.DefaultContainerExecutor: launchContainer: [bash, /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/bjoernh/appcache/application_1457949199710_0001/container_1457949199710_0001_01_000001/default_container_executor.sh]
>> > 16/03/14 10:56:34 WARN nodemanager.DefaultContainerExecutor: Exit code from container container_1457949199710_0001_01_000001 is : 1
>> > 16/03/14 10:56:34 WARN nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1457949199710_0001_01_000001 and exit code: 1
>> > ExitCodeException exitCode=1:
>> >         at org.apache.hadoop.util.Shell.runCommand(Shell.java:543)
>> >         at org.apache.hadoop.util.Shell.run(Shell.java:460)
>> >         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:720)
>> >         at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:210)
>> >         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
>> >         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
>> >         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>> >         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> >         at java.lang.Thread.run(Thread.java:745)
>> > 16/03/14 10:56:34 INFO nodemanager.ContainerExecutor: Exception from container-launch.
>> > 16/03/14 10:56:34 INFO nodemanager.ContainerExecutor: Container id: container_1457949199710_0001_01_000001
>> > 16/03/14 10:56:34 INFO nodemanager.ContainerExecutor: Exit code: 1
>> > ==================================================
>> >
>> > Unfortunately, the directory
>> > /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/bjoernh/appcache/
>> > is empty; the log indicates that it is being deleted after the failed
>> > attempt.
>> >
>> > Again, any hint would be useful, also regarding the activation of
>> > cgroups.
>> >
>> >
>> > Best regards,
>> > Björn
>> >
>> > --
>> > Dipl.-Inform. Björn Hagemeier
>> > Federated Systems and Data
>> > Juelich Supercomputing Centre
>> > Institute for Advanced Simulation
>> >
>> > Phone: +49 2461 61 1584
>> > Fax  : +49 2461 61 6656
>> > Email: b.hageme...@fz-juelich.de
>> > Skype: bhagemeier
>> > WWW  : http://www.fz-juelich.de/jsc
>> >
>> > JSC is the coordinator of the
>> > John von Neumann Institute for Computing
>> > and member of the
>> > Gauss Centre for Supercomputing
>> >
>> >
>> -------------------------------------------------------------------------------------
>> >
>> -------------------------------------------------------------------------------------
>> > Forschungszentrum Juelich GmbH
>> > 52425 Juelich
>> > Sitz der Gesellschaft: Juelich
>> > Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
>> > Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher
>> > Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
>> > Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,
>> > Prof. Dr. Sebastian M. Schmidt
>> >
>> -------------------------------------------------------------------------------------
>> >
>> -------------------------------------------------------------------------------------
>> >
>>
>>
