To give as complete a view of my situation as possible, I am compiling what I have
done and what my problem is, so that you have the fullest information.
I have done the following on two virtual machines, each with 4 cores and 4 GB of RAM.
Install Debian 7.8, plain, with no graphical interface.
apt-get install openjdk-7-jdk openjdk-7-jre git maven curl
git clone http://git-wip-us.apache.org/repos/asf/samza.git
gradlew clean build
As there was a bug in the Keyrocks testing script, I just commented out the failing
code in the TestTTL script.
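As a side note, I believe the same build can also be run with the test suites skipped
entirely, using the standard Gradle exclude option, which would avoid editing the
script (I have not verified this against this exact repository):
gradlew clean build -x test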
wget http://apache.rediris.es/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
tar -xvf hadoop-2.6.0.tar.gz
vi conf/yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>kfk-samza01</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>128</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>3</value>
</property>
</configuration>
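For completeness: I have not touched the node manager virtual memory check, which as
far as I know stays at its defaults (shown here only for reference, they are not in my
yarn-site.xml):
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
</property>
If I understand the ratio correctly, 2.1 x 256 MB = 537.6 MB, which matches the
virtual memory limit that appears in the errors further down.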
cp ./etc/hadoop/capacity-scheduler.xml conf
vi $HADOOP_YARN_HOME/conf/core-site.xml
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.http.impl</name>
<value>org.apache.samza.util.hadoop.HttpFileSystem</value>
</property>
</configuration>
curl http://www.scala-lang.org/files/archive/scala-2.10.4.tgz > scala-2.10.4.tgz
tar -xvf scala-2.10.4.tgz
cp /tmp/scala-2.10.4/lib/scala-compiler.jar $HADOOP_YARN_HOME/share/hadoop/hdfs/lib
cp /tmp/scala-2.10.4/lib/scala-library.jar $HADOOP_YARN_HOME/share/hadoop/hdfs/lib
curl -L http://search.maven.org/remotecontent?filepath=org/clapper/grizzled-slf4j_2.10/1.0.1/grizzled-slf4j_2.10-1.0.1.jar > $HADOOP_YARN_HOME/share/hadoop/hdfs/lib/grizzled-slf4j_2.10-1.0.1.jar
curl -L http://search.maven.org/remotecontent?filepath=org/apache/samza/samza-yarn_2.10/0.9.1/samza-yarn_2.10-0.9.1.jar > $HADOOP_YARN_HOME/share/hadoop/hdfs/lib/samza-yarn_2.10-0.9.1.jar
curl -L http://search.maven.org/remotecontent?filepath=org/apache/samza/samza-core_2.10/0.9.1/samza-core_2.10-0.9.1.jar > $HADOOP_YARN_HOME/share/hadoop/hdfs/lib/samza-core_2.10-0.9.1.jar
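As a quick sanity check that the extra jars ended up in that directory, I list them
afterwards:
ls $HADOOP_YARN_HOME/share/hadoop/hdfs/lib | grep -E 'scala|grizzled|samza'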
cd /opt/hadoop-2.6.0/
scp -r . 192.168.15.94:/opt/hadoop-2.6.0
echo 192.168.15.92 >> conf/slaves
echo 192.168.15.94 >> conf/slaves
sbin/start-yarn.sh
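To confirm that both node managers registered with the resource manager, I run the
standard YARN CLI from the Hadoop directory:
bin/yarn node -list
bin/yarn application -list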
I have copied all the scripts from the /opt/samza/samza-shell/src/main/bash/ folder
into /opt/jobs/bin.
I have generated an Eclipse project via Maven with the Samza dependencies included
and no jobs, packaged it and copied it to /opt/jobs/lib.
I have generated another Eclipse project via Maven with the Samza dependencies
included and three jobs that implement StreamTask and InitableTask. The functions are
empty, for testing purposes. It is published in a folder served through the Apache
web server.
I have created the associated job options file in the /opt/job/dtan folder like
this:
task.class=flow.WorkFlow
job.name=flow.WorkFlow
job.factory.class=org.apache.samza.job.yarn.YarnJobFactory
yarn.package.path=http://192.168.15.92/jobs/DataAnalyzer-0.0.1-bin.tar.gz
systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
systems.kafka.consumer.zookeeper.connect=kfk-kafka01:2181,kfk-kafka02:2181
systems.kafka.producer.bootstrap.servers=kfk-kafka01:9092,kfk-kafka01:9093,kfk-kafka02:9092,kfk-kafka02:9093
systems.kafka.producer.metadata.broker.list=kfk-kafka01:9092,kfk-kafka01:9093,kfk-kafka02:9092,kfk-kafka02:909
task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory
task.checkpoint.system=kafka
task.inputs=kafka.flowtpc
serializers.registry.json.class=org.apache.samza.serializers.JsonSerdeFactory
serializers.registry.string.class=org.apache.samza.serializers.StringSerdeFactory
systems.kafka.samza.msg.serde=string
systems.kafka.streams.tracetpc.samza.msg.serde=json
yarn.container.memory.mb=256
yarn.am.container.memory.mb=256
task.opts= -Xms128M -Xmx128M
task.commit.ms=100
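For reference, this is how I launch each job (the properties file name here is just
an example of mine):
/opt/jobs/bin/run-job.sh \
  --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory \
  --config-path=file:///opt/job/dtan/workflow.properties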
What I see:
• If I launch the three jobs, only one of them gets to RUNNING state, the one called
Router, and it is always the same one. The others stay in ACCEPTED until they are
killed by the system. I have seen this error:
  o Container [pid=23007,containerID=container_1443454508386_0003_01_000001] is running beyond virtual memory limits. Current usage: 13.9 MB of 256 MB physical memory used; 1.1 GB of 537.6 MB virtual memory used. Killing container
• When I kill the jobs with the kill-yarn-job.sh script, the java process does not
get killed (see the commands right after this list for how I clean them up by hand).
• Although I have set in the options that the job should be launched with -Xms128M
-Xmx128M, I see that it runs with -Xmx768M. I have even changed the run-class.sh
script, but nothing changes.
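In case it is useful, this is how I check for and kill the leftover java processes by
hand (the grep pattern is simply what I use, and the PID is a placeholder):
ps -ef | grep -i samza | grep -v grep
kill <pid>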
Some of the things I am describing do not make sense to me, so I am lost as to what
to do or where to look.
Thanks for your help,
Jordi
-----Original Message-----
From: Jordi Blasi Uribarri [mailto:[email protected]]
Sent: Monday, September 28, 2015 11:26
To: [email protected]
Subject: RE: container is running beyond virtual memory limits
I just changed the task options file to add the following line:
task.opts=-Xmx128M
And I found no change in the behaviour. I see that the job is still being launched
with the default -Xmx768M value:
root 8296 8294 1 11:16 ? 00:00:05
/usr/lib/jvm/java-7-openjdk-amd64/bin/java -server
-Dsamza.container.name=samza-application-master
-Dsamza.log.dir=/opt/hadoop-2.6.0/logs/userlogs/application_1443431699703_0003/container_1443431699703_0003_01_000001
-Djava.io.tmpdir=/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1443431699703_0003/container_1443431699703_0003_01_000001/__package/tmp
-Xmx768M -XX:+PrintGCDateStamps
-Xloggc:/opt/hadoop-2.6.0/logs/userlogs/application_1443431699703_0003/container_1443431699703_0003_01_000001/gc.log
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10241024
-d64 -cp
/opt/hadoop-2.6.0/conf:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1443431699703_0003/container_1443431699703_0003_01_000001/__package/lib/DataAnalyzer-0.0.1.jar:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1443431699703_0003/container_1443431699703_0003_01_000001/__package/lib/DataAnalyzer-0.0.1-jar-with-dependencies.jar:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1443431699703_0003/container_1443431699703_0003_01_000001/__package/lib/jackson-annotations-2.6.0.jar:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1443431699703_0003/container_1443431699703_0003_01_000001/__package/lib/jackson-core-2.6.0.jar:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1443431699703_0003/container_1443431699703_0003_01_000001/__package/lib/jackson-databind-2.6.0.jar:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1443431699703_0003/container_1443431699703_0003_01_000001/__package/lib/jackson-dataformat-smile-2.6.0.jar:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1443431699703_0003/container_1443431699703_0003_01_000001/__package/lib/jackson-jaxrs-json-provider-2.6.0.jar:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_1443431699703_0003/container_1443431699703_0003_01_000001/__package/lib/jackson-module-jaxb-annotations-2.6.0.jar
org.apache.samza.job.yarn.SamzaAppMaster
How do I set the correct value?
Thanks,
Jordi
-----Original Message-----
From: Yi Pan [mailto:[email protected]]
Sent: Monday, September 28, 2015 10:56
To: [email protected]
Subject: Re: container is running beyond virtual memory limits
Hi, Jordi,
Please find the config variable task.opts in this table:
http://samza.apache.org/learn/documentation/0.9/jobs/configuration-table.html
This allows you to add additional JVM opts when launching the containers.
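For example, something like the following in the job's properties file (the value is
only an illustration):
task.opts=-Xms128M -Xmx128M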
-Yi
On Mon, Sep 28, 2015 at 1:48 AM, Jordi Blasi Uribarri <[email protected]>
wrote:
> The three tasks have a similar options file, like this one.
>
> task.class=flow.OperationJob
> job.name=flow.OperationJob
> job.factory.class=org.apache.samza.job.yarn.YarnJobFactory
> yarn.package.path=http://IP/javaapp.tar.gz
>
>
> systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
> systems.kafka.consumer.zookeeper.connect=kfk-kafka01:2181,kfk-kafka02:2181
>
> systems.kafka.producer.bootstrap.servers=kfk-kafka01:9092,kfk-kafka01:9093,kfk-kafka02:9092,kfk-kafka02:9093
>
> systems.kafka.producer.metadata.broker.list=kfk-kafka01:9092,kfk-kafka01:9093,kfk-kafka02:9092,kfk-kafka02:909
>
>
> task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory
> task.checkpoint.system=kafka
> task.inputs=kafka.operationtpc
>
>
> serializers.registry.json.class=org.apache.samza.serializers.JsonSerdeFactory
>
> serializers.registry.string.class=org.apache.samza.serializers.StringSerdeFactory
>
> systems.kafka.samza.msg.serde=string
> systems.kafka.streams.tracetpc.samza.msg.serde=json
>
> yarn.container.memory.mb=256
> yarn.am.container.memory.mb=256
>
> task.commit.ms=1000
> task.window.ms=60000
>
> Where do I have to change the -Xmx parameter?
>
> Thanks.
>
> Jordi
>
>
> -----Original Message-----
> From: Yi Pan [mailto:[email protected]]
> Sent: Monday, September 28, 2015 10:39
> To: [email protected]
> Subject: Re: container is running beyond virtual memory limits
>
> Hi, Jordi,
>
> Can you post your task.opts settings as well? The Xms and Xmx JVM opts
> will play a role here as well. The Xmx size should be set to less than
> yarn.container.memory.mb.
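> For example, a consistent combination would be something like (values are only
> illustrative):
> yarn.container.memory.mb=256
> task.opts=-Xmx128M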
>
> -Yi
>
> On Tue, Sep 22, 2015 at 4:32 AM, Jordi Blasi Uribarri
> <[email protected]>
> wrote:
>
> > I am seeing that I cannot get even a single job running. I have
> > restored the original configuration of yarn-site.xml and
> > capacity-scheduler.xml and that does not work. I am thinking that
> > maybe there is some kind of information related to old jobs that
> > was not correctly cleaned up when they were killed. Is there any
> > place where I can look to remove temporary files or something similar?
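> > The only candidate I have found myself is the node manager local
> > directory that shows up in the logs, e.g.:
> > ls /tmp/hadoop-root/nm-local-dir/usercache/root/appcache/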
> >
> > Thanks
> >
> > jordi
> >
> > -----Original Message-----
> > From: Jordi Blasi Uribarri [mailto:[email protected]]
> > Sent: Tuesday, September 22, 2015 10:06
> > To: [email protected]
> > Subject: container is running beyond virtual memory limits
> >
> > Hi,
> >
> > I am not really sure if this is related to any of the previous
> > questions, so I am asking it in a new message. I am running three
> > different Samza jobs that perform different actions and exchange
> > information. As I found memory limits that were preventing the
> > jobs from getting from ACCEPTED to RUNNING, I introduced some
> > configuration changes in YARN, as suggested on this list:
> >
> >
> > yarn-site.xml
> >
> > <configuration>
> > <property>
> > <name>yarn.scheduler.minimum-allocation-mb</name>
> > <value>128</value>
> > <description>Minimum limit of memory to allocate to each
> > container request at the Resource Manager.</description>
> > </property>
> > <property>
> > <name>yarn.scheduler.maximum-allocation-mb</name>
> > <value>512</value>
> > <description>Maximum limit of memory to allocate to each
> > container request at the Resource Manager.</description>
> > </property>
> > <property>
> > <name>yarn.scheduler.minimum-allocation-vcores</name>
> > <value>1</value>
> > <description>The minimum allocation for every container request
> > at the RM, in terms of virtual CPU cores. Requests lower than this
> > won't take effect, and the specified value will get allocated the
> > minimum.</description>
> > </property>
> > <property>
> > <name>yarn.scheduler.maximum-allocation-vcores</name>
> > <value>2</value>
> > <description>The maximum allocation for every container request
> > at the RM, in terms of virtual CPU cores. Requests higher than this
> > won't take effect, and will get capped to this value.</description>
> > </property>
> > <property>
> > <name>yarn.resourcemanager.hostname</name>
> > <value>kfk-samza01</value>
> > </property>
> > </configuration>
> >
> > capacity-scheduler.xml
> > Altered value:
> > <property>
> > <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
> > <value>0.5</value>
> > <description>
> > Maximum percent of resources in the cluster which can be used
> > to
> run
> > application masters i.e. controls number of concurrent running
> > applications.
> > </description>
> > </property>
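> > My rough arithmetic for this value (only my own estimate): with two
> > node managers at 2048 MB each, 0.5 allows 2 x 2048 MB x 0.5 = 2048 MB
> > for application masters, i.e. room for about eight AMs at
> > yarn.am.container.memory.mb=256.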
> >
> > The jobs are configured to reduce the memory usage:
> >
> > yarn.container.memory.mb=256
> > yarn.am.container.memory.mb=256
> >
> > After introducing these changes I experienced a very noticeable
> > reduction in speed. That seemed normal, as the memory assigned to
> > the jobs was lowered and there were more of them running. Everything
> > was working until yesterday.
> >
> > What I am seeing today is that the jobs are not moving from ACCEPTED to
> > RUNNING. I have found the following in the log (full log at the end):
> >
> > 2015-09-22 09:54:36,661 INFO [Container Monitor]
> > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(408))
> > - Memory usage of ProcessTree 10346 for container-id
> > container_1442908447829_0001_01_000001: 70.0 MB of 256 MB physical
> > memory used; 1.2 GB of 537.6 MB virtual memory used
> >
> > I am not sure where that 1.2 GB comes from, and it is what makes the processes die.
> >
> > Thanks,
> >
> > Jordi
> >
> >
> >
> >
> > 2015-09-22 09:54:36,519 INFO [Container Monitor]
> > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(458))
> > - Removed ProcessTree with root 10271
> > 2015-09-22 09:54:36,519 INFO [AsyncDispatcher event handler]
> > container.Container (ContainerImpl.java:handle(999)) - Container
> > container_1442908447829_0002_01_000001 transitioned from RUNNING to
> > KILLING
> > 2015-09-22 09:54:36,533 INFO [AsyncDispatcher event handler]
> > launcher.ContainerLaunch
> > (ContainerLaunch.java:cleanupContainer(370))
> > - Cleaning up container container_1442908447829_0002_01_000001
> > 2015-09-22 09:54:36,661 INFO [Container Monitor]
> > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(408))
> > - Memory usage of ProcessTree 10346 for container-id
> > container_1442908447829_0001_01_000001: 70.0 MB of 256 MB physical
> > memory used; 1.2 GB of 537.6 MB virtual memory used
> > 2015-09-22 09:54:36,661 WARN [Container Monitor]
> > monitor.ContainersMonitorImpl
> > (ContainersMonitorImpl.java:isProcessTreeOverLimit(293)) - Process
> > tree for
> > container: container_1442908447829_0001_01_000001 running over twice
> > the configured limit. Limit=563714432, current usage = 1269743616
> > 2015-09-22 09:54:36,662 WARN [Container Monitor]
> > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(447))
> > - Container
> > [pid=10346,containerID=container_1442908447829_0001_01_000001] is
> > running beyond virtual memory limits. Current usage: 70.0 MB of 256
> > MB physical memory used; 1.2 GB of 537.6 MB virtual memory used.
> > Killing
> container.
> > Dump of the process-tree for container_1442908447829_0001_01_000001 :
> > |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
> > SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> > |- 10346 10344 10346 10346 (java) 253 7 1269743616 17908
> > /usr/lib/jvm/java-7-openjdk-amd64/bin/java -server
> > -Dsamza.container.name=samza-application-master
> > -Dsamza.log.dir=/opt/hadoop-2.6.0/logs/userlogs/application_14429084
> > 47
> > 829_0001/container_1442908447829_0001_01_000001
> > -Djava.io.tmpdir=/tmp/hadoop-root/nm-local-dir/usercache/root/appcac
> > he
> > /application_1442908447829_0001/container_1442908447829_0001_01_0000
> > 01
> > /__package/tmp
> > -Xmx768M -XX:+PrintGCDateStamps
> > -Xloggc:/opt/hadoop-2.6.0/logs/userlogs/application_1442908447829_00
> > 01 /container_1442908447829_0001_01_000001/gc.log
> > -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10
> > -XX:GCLogFileSize=10241024 -d64 -cp
> > /opt/hadoop-2.6.0/conf:/tmp/hadoop-root/nm-local-dir/usercache/root/
> > ap
> > pcache/application_1442908447829_0001/container_1442908447829_0001_0
> > 1_
> > 000001/__package/lib/jackson-annotations-2.6.0.jar:/tmp/hadoop-root/
> > nm
> > -local-dir/usercache/root/appcache/application_1442908447829_0001/co
> > nt
> > ainer_1442908447829_0001_01_000001/__package/lib/jackson-core-2.6.0.
> > ja
> > r:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_
> > 14
> > 42908447829_0001/container_1442908447829_0001_01_000001/__package/li
> > b/
> > jackson-databind-2.6.0.jar:/tmp/hadoop-root/nm-local-dir/usercache/r
> > oo
> > t/appcache/application_1442908447829_0001/container_1442908447829_00
> > 01
> > _01_000001/__package/lib/jackson-dataformat-smile-2.6.0.jar:/tmp/had
> > oo
> > p-root/nm-local-dir/usercache/root/appcache/application_144290844782
> > 9_
> > 0001/container_1442908447829_0001_01_000001/__package/lib/jackson-ja
> > xr
> > s-json-provider-2.6.0.jar:/tmp/hadoop-root/nm-local-dir/usercache/ro
> > ot
> > /appcache/application_1442908447829_0001/container_1442908447829_000
> > 1_
> > 01_000001/__package/lib/jackson-module-jaxb-annotations-2.6.0.jar:/t
> > mp
> > /hadoop-root/nm-local-dir/usercache/root/appcache/application_144290
> > 84
> > 47829_0001/container_1442908447829_0001_01_000001/__package/lib/nxtB
> > ro
> > ker-0.0.1.jar:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/
> > ap
> > plication_1442908447829_0001/container_1442908447829_0001_01_000001/
> > __ package/lib/nxtBroker-0.0.1-jar-with-dependencies.jar
> > org.apache.samza.job.yarn.SamzaAppMaster
> >
> > 2015-09-22 09:54:36,663 INFO [Container Monitor]
> > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(458))
> > - Removed ProcessTree with root 10346
> > 2015-09-22 09:54:36,663 INFO [AsyncDispatcher event handler]
> > container.Container (ContainerImpl.java:handle(999)) - Container
> > container_1442908447829_0001_01_000001 transitioned from RUNNING to
> > KILLING
> > 2015-09-22 09:54:36,663 INFO [AsyncDispatcher event handler]
> > launcher.ContainerLaunch
> > (ContainerLaunch.java:cleanupContainer(370))
> > - Cleaning up container container_1442908447829_0001_01_000001
> > ________________________________
> > Jordi Blasi Uribarri
> > R&D&I Department
> >
> > [email protected]
> > Bilbao Office
> >