Re: How are samza.container.id generated in yarn?

2021-04-28 Thread Debraj Manna
Does anyone have any thoughts on my query below?

> Can you please provide a bit more detail or point me to the code showing how
> Samza is mapping YARN container IDs to durable sequential integers?




Re: How are samza.container.id generated in yarn?

2021-04-23 Thread Debraj Manna
Thanks, Jagadish, for replying. Yes, I am treating it as an opaque durable ID
to execute some code on a specific container.
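
A minimal sketch of that use, assuming only what the ps -ef output elsewhere in this thread shows: the ID reaches the container JVM as the -Dsamza.container.id system property. The class and method names (ContainerSpecificWork, isTargetContainer, runIfTarget) are hypothetical and are not part of Samza; Samza may also expose the ID through its own APIs, which this sketch does not rely on.

```java
final class ContainerSpecificWork {
    // Property name as it appears in the -Dsamza.container.id=... JVM flag.
    static final String CONTAINER_ID_PROPERTY = "samza.container.id";

    /** True when this JVM was started with the given container ID. */
    static boolean isTargetContainer(String targetId) {
        // Compare as strings: the ID is an opaque key, not a number to compute with.
        return targetId.equals(System.getProperty(CONTAINER_ID_PROPERTY));
    }

    /** Runs the work only on the target container; returns whether it ran. */
    static boolean runIfTarget(String targetId, Runnable work) {
        if (!isTargetContainer(targetId)) {
            return false;
        }
        work.run();
        return true;
    }

    public static void main(String[] args) {
        // "2" is simply the value seen in the ps output; any ID of the job would do.
        boolean ran = runIfTarget("2", () -> System.out.println("running container-specific work"));
        System.out.println("ran = " + ran);
    }
}
```

Comparing the value as a string keeps the code independent of how the IDs happen to be generated.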

Just for my understanding, can you please provide a bit more detail or
point me to the code showing how Samza maps YARN container IDs to durable
sequential integers?



Re: How are samza.container.id generated in yarn?

2021-04-23 Thread Jagadish Venkatraman
Hi Debraj,

To clarify Yi’s response a bit: it was referring to the physical
YARN container ID.

If there are N YARN containers, samza.container.id values are generated
sequentially from 0 to N-1. This ID is meant to be durable, i.e., if a
particular container fails, the Samza AM will restart it with the same ID.

Having said that, you should just treat it as an opaque key that uniquely
identifies a container within a Samza job.
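
As a rough illustration only (this is not Samza's actual code), the behaviour described above can be sketched as a table from logical IDs "0" to "N-1" to whichever YARN container currently runs each one. The class LogicalContainerIds, its methods, and the "yarn-a"-style container names are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LogicalContainerIds {
    // Logical ID ("0".."N-1") -> YARN container currently running it, or null if none.
    private final Map<String, String> running = new LinkedHashMap<>();

    LogicalContainerIds(int containerCount) {
        for (int i = 0; i < containerCount; i++) {
            running.put(String.valueOf(i), null);
        }
    }

    /** Binds a newly allocated YARN container to the first unassigned logical ID. */
    String onAllocated(String yarnContainerId) {
        for (Map.Entry<String, String> entry : running.entrySet()) {
            if (entry.getValue() == null) {
                entry.setValue(yarnContainerId);
                return entry.getKey();
            }
        }
        throw new IllegalStateException("no logical ID is waiting for a container");
    }

    /** Frees the logical ID of a failed YARN container so the next allocation reuses it. */
    String onFailed(String yarnContainerId) {
        for (Map.Entry<String, String> entry : running.entrySet()) {
            if (yarnContainerId.equals(entry.getValue())) {
                entry.setValue(null);
                return entry.getKey();
            }
        }
        throw new IllegalArgumentException("unknown YARN container: " + yarnContainerId);
    }

    public static void main(String[] args) {
        LogicalContainerIds ids = new LogicalContainerIds(3);
        System.out.println(ids.onAllocated("yarn-a")); // prints 0
        System.out.println(ids.onAllocated("yarn-b")); // prints 1
        System.out.println(ids.onAllocated("yarn-c")); // prints 2
        System.out.println(ids.onFailed("yarn-b"));    // prints 1
        System.out.println(ids.onAllocated("yarn-d")); // prints 1: the same logical ID is reused
    }
}
```

The point of the sketch is that the logical ID outlives any single YARN container, which is why it should be treated as an opaque key.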

Could you share some details on how you intend to use this?



-- 
Jagadish


Re: How are samza.container.id generated in yarn?

2021-04-23 Thread Debraj Manna
Thanks, Yi, for replying.

I also checked that class. But Container#getId().toString()
returns a string like container_e02_1619095810959_0006_10_04, whereas I am
seeing that samza.container.id is an integer like 0 or 1 that is set as a
system property. Can you let me know how Container#getId().toString() is
mapped to an integer?

For example, below is the output of ps -ef for a Samza YARN container, where I
see -Dsamza.container.id=2 and -Dsamza.container.name=samza-container-2:

yarn  7706  7704  7 Apr22 ?        02:08:23
/usr/lib/jvm/zulu-11-amd64/bin/java -Xmx8820M
-XX:-OmitStackTraceInFastThrow -XX:NewRatio=8 -Xss256K
-XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/var/lib/heap-dumps/samzajobs
-XX:NativeMemoryTracking=summary -Dio.netty.allocator.type=unpooled
-Dio.grpc.netty.shaded.io.netty.allocator.type=unpooled -server
-Dsamza.container.id=2 -Dsamza.container.name=samza-container-2
-DisThreadContextMapInheritable=true
-Dlog4j.configuration=file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/ubuntu/appcache/application_1619095810959_0006/container_e02_1619095810959_0006_10_04/__package/lib/log4j.xml
-Dsamza.log.dir=/var/log/hadoop-yarn/containers/application_1619095810959_0006/container_e02_1619095810959_0006_10_04
-Djava.io.tmpdir=/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/ubuntu/appcache/application_1619095810959_0006/container_e02_1619095810959_0006_10_04/__package/tmp
-cp
/etc/hadoop/conf::/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/ubuntu/appcache/application_1619095810959_0006/container_e02_1619095810959_0006_10_04/__package/lib/activation-1.1.jar:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/ubuntu/appcache/




Re: How are samza.container.id generated in yarn?

2021-04-22 Thread Yi Pan
Hi, Debraj,

In a YARN environment, Samza uses YARN-generated container IDs as
environment variables to set each container process's samza.container.id.
That is, when the Samza AM process requests containers in YARN, the YARN RM
replies with a set of allocated container objects of class
org.apache.hadoop.yarn.api.records.Container. That is the resource class that
uniquely identifies a container in YARN, and Container#getId().toString() is
the container ID string we set as samza.container.id.

Best,

-Yi



Re: How are samza.container.id generated in yarn?

2021-04-22 Thread Debraj Manna
The same has been asked on Stack Overflow also. Does anyone have any
thoughts on this?

https://stackoverflow.com/questions/67207850/how-does-samza-generate-the-container-id-when-the-application-is-deployed-in-yar



How are samza.container.id generated in yarn?

2021-04-21 Thread Debraj Manna
Hi

Can someone let me know how "samza.container.id" is generated when a Samza
app is running in YARN?

Thanks,