Hi Debraj,

Clarifying a bit on Yi's response, since it was referring to the physical YARN container ID rather than samza.container.id.
If there are N YARN containers, the samza.container.id values are generated sequentially from 0 to N-1. This ID is meant to be durable, i.e., if a particular container fails, the Samza AM will restart it with the same ID. Having said that, you should just treat it as an opaque key that uniquely identifies a container within a Samza job.

Could you share some details on how you intend to use this?

On Friday, April 23, 2021, Debraj Manna <subharaj.ma...@gmail.com> wrote:

> Thanks, Yi, for replying.
>
> I also checked that class. But Container#getId().toString() returns a
> string like container_e02_1619095810959_0006_10_000004, whereas I am
> seeing that samza.container.id is an integer like 0 or 1 that is set as a
> system property. Can you let me know how Container#getId().toString() is
> mapped to an integer?
>
> For example, below is the output of ps -ef for a Samza YARN container, and
> I am seeing *-Dsamza.container.id=2* &
> *-Dsamza.container.name=samza-container-2*:
>
> yarn 7706 7704 7 Apr22 ? 02:08:23
> /usr/lib/jvm/zulu-11-amd64/bin/java -Xmx8820M
> -XX:-OmitStackTraceInFastThrow -XX:NewRatio=8 -Xss256K
> -XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError
> -XX:HeapDumpPath=/var/lib/heap-dumps/samzajobs
> -XX:NativeMemoryTracking=summary -Dio.netty.allocator.type=unpooled
> -Dio.grpc.netty.shaded.io.netty.allocator.type=unpooled -server
> *-Dsamza.container.id=2 -Dsamza.container.name=samza-container-2*
> -DisThreadContextMapInheritable=true
> -Dlog4j.configuration=file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/ubuntu/appcache/application_1619095810959_0006/container_e02_1619095810959_0006_10_000004/__package/lib/log4j.xml
> -Dsamza.log.dir=/var/log/hadoop-yarn/containers/application_1619095810959_0006/container_e02_1619095810959_0006_10_000004
> -Djava.io.tmpdir=/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/ubuntu/appcache/application_1619095810959_0006/container_e02_1619095810959_0006_10_000004/__package/tmp
> -cp
> /etc/hadoop/conf::/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/ubuntu/appcache/application_1619095810959_0006/container_e02_1619095810959_0006_10_000004/__package/lib/activation-1.1.jar:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/ubuntu/appcache/
>
>
> On Fri, Apr 23, 2021 at 11:09 AM Yi Pan <nickpa...@gmail.com> wrote:
>
> > Hi, Debraj,
> >
> > In a YARN environment, Samza uses YARN-generated container IDs as
> > environment variables to set each container process's
> > samza.container.id, i.e., when containers are requested by the Samza AM
> > process in YARN, the YARN RM replies with a set of allocated container
> > objects of class org.apache.hadoop.yarn.api.records.Container. That is
> > the resource class that uniquely identifies a container in YARN, and
> > Container#getId().toString() is the container ID string we set to
> > samza.container.id.
> >
> > Best,
> >
> > -Yi
> >
> > On Wed, Apr 21, 2021 at 11:28 PM Debraj Manna <subharaj.ma...@gmail.com>
> > wrote:
> >
> > > The same has been asked on Stack Overflow also. Anyone any thoughts
> > > on this?
> > >
> > > https://stackoverflow.com/questions/67207850/how-does-samza-generate-the-container-id-when-the-application-is-deployed-in-yar
> > >
> > > On Wed, Apr 21, 2021 at 6:08 PM Debraj Manna <subharaj.ma...@gmail.com>
> > > wrote:
> > >
> > > > Hi
> > > >
> > > > Can someone let me know how "samza.container.id" is generated when
> > > > a Samza app is running in YARN?
> > > >
> > > > Thanks,

-- Jagadish
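
The difference between the two IDs in this thread (the logical samza.container.id and the physical YARN container ID) can be illustrated with a toy Python model. This is only a sketch under assumptions, not Samza's actual code: the class ToyAppMaster, its methods, and the counter used to fake YARN-style ID strings are all hypothetical, and the application ID is copied from the ps output above purely to give the strings a familiar shape. The point is that the logical IDs run from 0 to N-1 and stay fixed across a restart, while the physical ID behind a logical ID changes.

```python
from itertools import count


class ToyAppMaster:
    """Toy stand-in for the AM's bookkeeping. Hypothetical, not Samza code."""

    def __init__(self, num_containers, app_id="1619095810959_0006"):
        self._seq = count(1)  # hypothetical counter used to fake YARN IDs
        self._app_id = app_id
        # logical ID ("0" .. "N-1") -> physical YARN-style container ID
        self.running = {
            str(i): self._allocate() for i in range(num_containers)
        }

    def _allocate(self):
        # Builds a string shaped like the YARN IDs quoted in the thread.
        return f"container_e02_{self._app_id}_10_{next(self._seq):06d}"

    def restart(self, logical_id):
        """Replace a failed container; the logical ID is reused as-is."""
        self.running[logical_id] = self._allocate()
        return self.running[logical_id]


am = ToyAppMaster(num_containers=3)
before = am.running["2"]
after = am.restart("2")

print(sorted(am.running))   # logical IDs are unchanged by the restart
print(before, "->", after)  # the physical ID behind "2" is different
```

Application code that keys its state on the logical ID, and treats it as an opaque string, is unaffected by which physical container happens to be running it.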