RE: container is running beyond virtual memory limits

Jordi Blasi Uribarri Mon, 28 Sep 2015 01:49:38 -0700

The three tasks have a similar options file, like this one.

task.class=flow.OperationJob
job.name=flow.OperationJob
job.factory.class=org.apache.samza.job.yarn.YarnJobFactory
yarn.package.path=http://IP/javaapp.tar.gz


systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
systems.kafka.consumer.zookeeper.connect=kfk-kafka01:2181,kfk-kafka02:2181
systems.kafka.producer.bootstrap.servers=kfk-kafka01:9092,kfk-kafka01:9093,kfk-kafka02:9092,kfk-kafka02:9093
systems.kafka.producer.metadata.broker.list=kfk-kafka01:9092,kfk-kafka01:9093,kfk-kafka02:9092,kfk-kafka02:909

task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory
task.checkpoint.system=kafka
task.inputs=kafka.operationtpc

serializers.registry.json.class=org.apache.samza.serializers.JsonSerdeFactory
serializers.registry.string.class=org.apache.samza.serializers.StringSerdeFactory

systems.kafka.samza.msg.serde=string
systems.kafka.streams.tracetpc.samza.msg.serde=json

yarn.container.memory.mb=256
yarn.am.container.memory.mb=256

task.commit.ms=1000
task.window.ms=60000

Where do I have to change the Xmx parameter?

Thanks.

     Jordi


-----Original Message-----
From: Yi Pan [mailto:[email protected]]
Sent: Monday, 28 September 2015 10:39
To: [email protected]
Subject: Re: container is running beyond virtual memory limits

Hi, Jordi,

Can you post your task.opts settings as well? The Xms and Xmx JVM opts will 
play a role here as well. The Xmx size should be set to less than 
yarn.container.memory.mb.
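
As a rough illustration of that check, the Python sketch below parses a properties snippet, pulls the -Xmx value out of task.opts, and compares it with yarn.container.memory.mb. The task.opts value is hypothetical (the thread does not show one), and the helper names are made up for this example.

```python
import re

# Hypothetical job properties; only yarn.container.memory.mb=256 comes from the thread.
PROPERTIES = """
yarn.container.memory.mb=256
task.opts=-Xmx192M -XX:+PrintGCDateStamps
"""

# Multipliers to convert a JVM size suffix into megabytes.
UNIT_TO_MB = {"k": 1 / 1024, "m": 1, "g": 1024}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def xmx_mb(jvm_opts):
    """Return the -Xmx heap size in MB, or None if no -Xmx flag is found.

    Assumes the flag carries a k/m/g suffix, as in -Xmx768M.
    """
    match = re.search(r"-Xmx(\d+)([kKmMgG])", jvm_opts)
    if match is None:
        return None
    return int(match.group(1)) * UNIT_TO_MB[match.group(2).lower()]


def heap_fits(props):
    """True when -Xmx in task.opts is strictly below the container size."""
    heap = xmx_mb(props.get("task.opts", ""))
    container = int(props["yarn.container.memory.mb"])
    return heap is not None and heap < container


props = parse_properties(PROPERTIES)
print(xmx_mb(props["task.opts"]), heap_fits(props))
```

With -Xmx768M against a 256 MB container, heap_fits would return False.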

-Yi

On Tue, Sep 22, 2015 at 4:32 AM, Jordi Blasi Uribarri <[email protected]>
wrote:

> I am finding that I cannot get even a single job running. I have 
> restored the original yarn-site.xml and capacity-scheduler.xml 
> configuration and that does not work either. I am thinking that 
> maybe there is some information related to old jobs that was 
> not cleaned up correctly when I killed them. Is there any place where 
> I can look to remove temporary files or something similar?
>
> Thanks
>
>         jordi
>
> -----Original Message-----
> From: Jordi Blasi Uribarri [mailto:[email protected]]
> Sent: Tuesday, 22 September 2015 10:06
> To: [email protected]
> Subject: container is running beyond virtual memory limits
>
> Hi,
>
> I am not really sure if this is related to any of the previous 
> questions, so I am asking it in a new message. I am running three 
> different Samza jobs that perform different actions and exchange 
> information. As I hit memory limits that were preventing the 
> jobs from moving from ACCEPTED to RUNNING, I introduced some 
> configuration changes in YARN, as suggested on this list:
>
>
> yarn-site.xml
>
> <configuration>
>   <property>
>     <name>yarn.scheduler.minimum-allocation-mb</name>
>     <value>128</value>
>     <description>Minimum limit of memory to allocate to each container 
> request at the Resource Manager.</description>
>   </property>
>   <property>
>     <name>yarn.scheduler.maximum-allocation-mb</name>
>     <value>512</value>
>     <description>Maximum limit of memory to allocate to each container 
> request at the Resource Manager.</description>
>   </property>
>   <property>
>     <name>yarn.scheduler.minimum-allocation-vcores</name>
>     <value>1</value>
>     <description>The minimum allocation for every container request at 
> the RM, in terms of virtual CPU cores. Requests lower than this won't 
> take effect, and the specified value will get allocated the 
> minimum.</description>
>   </property>
>   <property>
>     <name>yarn.scheduler.maximum-allocation-vcores</name>
>     <value>2</value>
>     <description>The maximum allocation for every container request at 
> the RM, in terms of virtual CPU cores. Requests higher than this won't 
> take effect, and will get capped to this value.</description>
>   </property>
>   <property>
>     <name>yarn.resourcemanager.hostname</name>
>     <value>kfk-samza01</value>
>   </property>
> </configuration>
>
> capacity-scheduler.xml
> Alter value
>   <property>
>     <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
>     <value>0.5</value>
>     <description>
>       Maximum percent of resources in the cluster which can be used to run
>       application masters i.e. controls number of concurrent running
>       applications.
>     </description>
>   </property>
>
> The jobs are configured to reduce the memory usage:
>
> yarn.container.memory.mb=256
> yarn.am.container.memory.mb=256
>
> After introducing these changes I experienced a very noticeable 
> reduction in speed. That seemed normal, as the memory assigned to the 
> jobs was lowered and there were more of them running. It was running 
> until yesterday.
>
> What I have seen today is that they are not moving from ACCEPTED to 
> RUNNING. I have found the following in the log (full log at the end):
>
> 2015-09-22 09:54:36,661 INFO  [Container Monitor] 
> monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(408)) - 
> Memory usage of ProcessTree 10346 for container-id
> container_1442908447829_0001_01_000001: 70.0 MB of 256 MB physical 
> memory used; 1.2 GB of 537.6 MB virtual memory used
>
> I am not sure where that 1.2 GB comes from or why it makes the processes die.
>
> Thanks,
>
>    Jordi
>
>
>
>
> 2015-09-22 09:54:36,519 INFO  [Container Monitor] 
> monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(458)) - 
> Removed ProcessTree with root 10271
> 2015-09-22 09:54:36,519 INFO  [AsyncDispatcher event handler] 
> container.Container (ContainerImpl.java:handle(999)) - Container
> container_1442908447829_0002_01_000001 transitioned from RUNNING to 
> KILLING
> 2015-09-22 09:54:36,533 INFO  [AsyncDispatcher event handler] 
> launcher.ContainerLaunch (ContainerLaunch.java:cleanupContainer(370)) 
> - Cleaning up container container_1442908447829_0002_01_000001
> 2015-09-22 09:54:36,661 INFO  [Container Monitor] 
> monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(408)) - 
> Memory usage of ProcessTree 10346 for container-id
> container_1442908447829_0001_01_000001: 70.0 MB of 256 MB physical 
> memory used; 1.2 GB of 537.6 MB virtual memory used
> 2015-09-22 09:54:36,661 WARN  [Container Monitor] 
> monitor.ContainersMonitorImpl
> (ContainersMonitorImpl.java:isProcessTreeOverLimit(293)) - Process 
> tree for
> container: container_1442908447829_0001_01_000001 running over twice 
> the configured limit. Limit=563714432, current usage = 1269743616
> 2015-09-22 09:54:36,662 WARN  [Container Monitor] 
> monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(447)) - 
> Container 
> [pid=10346,containerID=container_1442908447829_0001_01_000001] is 
> running beyond virtual memory limits. Current usage: 70.0 MB of 256 MB 
> physical memory used; 1.2 GB of 537.6 MB virtual memory used. Killing 
> container.
> Dump of the process-tree for container_1442908447829_0001_01_000001 :
>         |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
> SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>         |- 10346 10344 10346 10346 (java) 253 7 1269743616 17908 
> /usr/lib/jvm/java-7-openjdk-amd64/bin/java -server 
> -Dsamza.container.name=samza-application-master
> -Dsamza.log.dir=/opt/hadoop-2.6.0/logs/userlogs/application_1442908447
> 829_0001/container_1442908447829_0001_01_000001
> -Djava.io.tmpdir=/tmp/hadoop-root/nm-local-dir/usercache/root/appcache
> /application_1442908447829_0001/container_1442908447829_0001_01_000001
> /__package/tmp
> -Xmx768M -XX:+PrintGCDateStamps
> -Xloggc:/opt/hadoop-2.6.0/logs/userlogs/application_1442908447829_0001
> /container_1442908447829_0001_01_000001/gc.log
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10
> -XX:GCLogFileSize=10241024 -d64 -cp
> /opt/hadoop-2.6.0/conf:/tmp/hadoop-root/nm-local-dir/usercache/root/ap
> pcache/application_1442908447829_0001/container_1442908447829_0001_01_
> 000001/__package/lib/jackson-annotations-2.6.0.jar:/tmp/hadoop-root/nm
> -local-dir/usercache/root/appcache/application_1442908447829_0001/cont
> ainer_1442908447829_0001_01_000001/__package/lib/jackson-core-2.6.0.ja
> r:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/application_14
> 42908447829_0001/container_1442908447829_0001_01_000001/__package/lib/
> jackson-databind-2.6.0.jar:/tmp/hadoop-root/nm-local-dir/usercache/roo
> t/appcache/application_1442908447829_0001/container_1442908447829_0001
> _01_000001/__package/lib/jackson-dataformat-smile-2.6.0.jar:/tmp/hadoo
> p-root/nm-local-dir/usercache/root/appcache/application_1442908447829_
> 0001/container_1442908447829_0001_01_000001/__package/lib/jackson-jaxr
> s-json-provider-2.6.0.jar:/tmp/hadoop-root/nm-local-dir/usercache/root
> /appcache/application_1442908447829_0001/container_1442908447829_0001_
> 01_000001/__package/lib/jackson-module-jaxb-annotations-2.6.0.jar:/tmp
> /hadoop-root/nm-local-dir/usercache/root/appcache/application_14429084
> 47829_0001/container_1442908447829_0001_01_000001/__package/lib/nxtBro
> ker-0.0.1.jar:/tmp/hadoop-root/nm-local-dir/usercache/root/appcache/ap
> plication_1442908447829_0001/container_1442908447829_0001_01_000001/__
> package/lib/nxtBroker-0.0.1-jar-with-dependencies.jar
> org.apache.samza.job.yarn.SamzaAppMaster
>
> 2015-09-22 09:54:36,663 INFO  [Container Monitor] 
> monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(458)) - 
> Removed ProcessTree with root 10346
> 2015-09-22 09:54:36,663 INFO  [AsyncDispatcher event handler] 
> container.Container (ContainerImpl.java:handle(999)) - Container
> container_1442908447829_0001_01_000001 transitioned from RUNNING to 
> KILLING
> 2015-09-22 09:54:36,663 INFO  [AsyncDispatcher event handler] 
> launcher.ContainerLaunch (ContainerLaunch.java:cleanupContainer(370)) 
> - Cleaning up container container_1442908447829_0001_01_000001
> ________________________________
> Jordi Blasi Uribarri
> Área I+D+i
>
> [email protected]
> Oficina Bilbao
>
>
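
The numbers in the NodeManager log quoted above can be reproduced with a little arithmetic. The sketch below assumes the NodeManager is using YARN's default yarn.nodemanager.vmem-pmem-ratio of 2.1 (the thread does not show that setting, but 537.6 MB for a 256 MB container is consistent with it). The byte counts and the -Xmx768M flag are copied from the log.

```python
MB = 1024 * 1024

container_mb = 256        # yarn.am.container.memory.mb from the job config
vmem_pmem_ratio = 2.1     # assumed: YARN default for yarn.nodemanager.vmem-pmem-ratio

# Virtual memory ceiling enforced for the container, in MB.
vmem_limit_mb = container_mb * vmem_pmem_ratio

# Values reported in the log.
limit_bytes = 563714432   # "Limit=563714432"
usage_bytes = 1269743616  # "current usage = 1269743616"
am_xmx_mb = 768           # -Xmx768M on the SamzaAppMaster command line

print(f"computed vmem limit: {vmem_limit_mb:.1f} MB")
print(f"logged vmem limit:   {limit_bytes / MB:.1f} MB")
print(f"logged vmem usage:   {usage_bytes / MB:.1f} MB")
print(f"usage / limit:       {usage_bytes / limit_bytes:.2f}x")
print(f"AM -Xmx larger than container: {am_xmx_mb > container_mb}")
```

One plausible reading is that the virtual size counts address space the JVM reserves (heap reservation, thread stacks, mapped jars) rather than memory actually in use, which would explain 1.2 GB virtual against 70 MB physical. Note also that the application master in the log was started with -Xmx768M, well above its 256 MB container.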
