Re: How to run multiple instances of the same job

Telles Nobrega Mon, 18 Aug 2014 09:48:01 -0700

I see, thanks. The weird thing is that it works for a few rounds and then stops.


On Mon, Aug 18, 2014 at 1:44 PM, Chris Riccomini <
[email protected]> wrote:

> Hey Telles,
>
> The problem could occur with HDFS. I believe that LOCALIZING just means
> that the NM is trying to download the artifact from wherever it is (be
> that HTTP, HDFS, etc).
>
> Cheers,
> Chris
>
> On 8/18/14 9:22 AM, "Telles Nobrega" <[email protected]> wrote:
>
> >Chris,
> >
> >I'm using HDFS. I will run it again to see whether the problem recurs, and I
> >will post here if I find anything or have more questions.
> >
> >Thanks.
> >
> >
> >On Mon, Aug 18, 2014 at 12:45 PM, Chris Riccomini <
> >[email protected]> wrote:
> >
> >> Hey Telles,
> >>
> >> Usually, when a job is stuck in LOCALIZING, it means that YARN is
> >> struggling to distribute your binary (the .tgz) to the appropriate
> >> NodeManagers, I think. You should check your NM logs and see if there are
> >> any hints about what's going on there.
> >>
> >> I've seen this in the past when the NM hangs trying to download a .tgz
> >> from the HTTP server for some reason.
> >>
> >> Cheers,
> >> Chris
> >>
> >> On 8/16/14 10:41 PM, "Telles Nobrega" <[email protected]> wrote:
> >>
> >> >I was able to fix this problem, but now I'm having another one. I'm using a
> >> >script that starts Kafka, deploys the Samza jobs, stops them, kills Kafka,
> >> >and deletes the configuration in ZooKeeper and the kafka-log files. Then it
> >> >starts over again. Sometimes the jobs don't start running; they stay in the
> >> >ACCEPTED state with the info LOCALIZING. What could be the cause of that?
> >> >
> >> >Thanks.
> >> >On 15 Aug 2014, at 19:18, Chris Riccomini <[email protected]> wrote:
> >> >
> >> >> Hey Telles,
> >> >>
> >> >> If you set yarn.container.count to 5, you should get 5 containers. The
> >> >> two cases where you don't are:
> >> >>
> >> >> 1. The grid is at capacity, and doesn't have the memory to fulfill all
> >> >> container requests.
> >> >> 2. You set yarn.container.count higher than the number of partitions that
> >> >> your input stream has.
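The two cases listed above can be sketched as a small calculation. This is only a simplified model of the description in this thread, not Samza's or YARN's actual scheduling logic; the function name and the memory parameters are hypothetical and chosen for illustration.

```python
def expected_containers(requested, input_partitions, free_memory_mb, container_memory_mb):
    """Rough upper bound on the containers a job ends up with.

    Simplified model of the two cases above; not Samza's or YARN's real logic.
    All parameter names are assumptions made for this sketch.
    """
    if requested < 1 or input_partitions < 1 or container_memory_mb < 1:
        raise ValueError("counts and container memory must be positive")
    # Case 1: the grid only has memory for this many containers.
    fits_in_grid = max(free_memory_mb, 0) // container_memory_mb
    # Case 2: containers beyond the partition count have nothing to read.
    return min(requested, input_partitions, fits_in_grid)


# yarn.container.count=5, but the input stream has only 3 partitions.
print(expected_containers(5, 3, free_memory_mb=8192, container_memory_mb=1024))  # 3
```

If the result is lower than the requested count, checking the topic's partition count and the grid's free memory is a reasonable first step.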
> >> >>
> >> >> Cheers,
> >> >> Chris
> >> >>
> >> >> On 8/15/14 1:56 PM, "Telles Nobrega" <[email protected]> wrote:
> >> >>
> >> >>> Hi Chris,
> >> >>>
> >> >>> I started playing with yarn.container.count and set it to 5.
> >> >>>
> >> >>> At first I thought I had to compile the package again and republish it to
> >> >>> HDFS because I couldn't run 5 containers.
> >> >>> Then I recompiled, but I still only got 3 containers. Is that normal
> >> >>> behaviour?
> >> >>>
> >> >>> Thanks.
> >> >>>
> >> >>>
> >> >>> On Wed, Aug 13, 2014 at 5:00 PM, Telles Nobrega <[email protected]> wrote:
> >> >>>
> >> >>>> Thanks Chris, I will take a look at these links and come back if I
> >> >>>> have more questions.
> >> >>>>
> >> >>>>
> >> >>>> On Wed, Aug 13, 2014 at 4:33 PM, Chris Riccomini <
> >> >>>> [email protected]> wrote:
> >> >>>>
> >> >>>>> Hey Telles,
> >> >>>>>
> >> >>>>>>> Should I use many kafka brokers or one will suffice?
> >> >>>>>
> >> >>>>> The number of brokers you use is dependent on the number of
> >> >>>>> messages/sec you're going to receive, the size of those messages, and
> >> >>>>> how long you're going to retain them.
> >> >>>>>
> >> >>>>> Here is a good blog post on Kafka performance that should give you some
> >> >>>>> idea of the numbers:
> >> >>>>>
> >> >>>>> https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
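The three factors named above (messages/sec, message size, retention) can be combined into a back-of-the-envelope estimate. The sketch below is plain arithmetic, not a Kafka sizing tool; the 33200 messages/second figure comes from this thread, while the 100-byte message size and the 24-hour retention are assumptions made purely for illustration.

```python
def ingest_mb_per_sec(messages_per_sec, avg_message_bytes):
    """Write throughput the brokers must absorb, in MB/s (1 MB = 10**6 bytes)."""
    return messages_per_sec * avg_message_bytes / 1_000_000


def retained_bytes(messages_per_sec, avg_message_bytes, retention_hours, replication_factor=1):
    """Disk space needed to hold everything written during the retention window."""
    seconds = retention_hours * 3600
    return messages_per_sec * avg_message_bytes * seconds * replication_factor


# Peak rate from the thread; message size and retention are assumed values.
rate = 33200
size = 100
print(f"{ingest_mb_per_sec(rate, size):.2f} MB/s")        # 3.32 MB/s
print(f"{retained_bytes(rate, size, 24) / 1e9:.1f} GB")   # 286.8 GB
```

Comparing these numbers with measured per-broker throughput, such as the figures in the blog post above, gives a rough idea of whether a single broker is enough.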
> >> >>>>>
> >> >>>>>>> It could be just one job, but what is the best way to deploy many
> >> >>>>>>> instances of this job so I could process a heavy load of messages?
> >> >>>>>
> >> >>>>> You should adjust the yarn.container.count to increase the parallelism
> >> >>>>> of your job. By default, you get one container, but you can adjust this
> >> >>>>> up to the total number of input partitions that you have. Have a look
> >> >>>>> here for some details about how Samza's parallelism works:
> >> >>>>>
> >> >>>>> http://samza.incubator.apache.org/learn/documentation/0.7.0/introduction/concepts.html
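To illustrate why the container count is bounded by the number of input partitions, here is a minimal sketch that spreads partitions over containers with a simple modulo rule. The modulo rule and the function name are assumptions made for illustration; Samza's actual grouping may differ, so treat this as a model of the idea rather than the implementation.

```python
def assign_partitions(num_partitions, num_containers):
    """Spread partition ids over container ids with a round-robin (modulo) rule.

    Illustrative assumption only; not necessarily how Samza groups partitions.
    """
    if num_partitions < 0 or num_containers < 1:
        raise ValueError("need at least one container and a non-negative partition count")
    assignment = {container: [] for container in range(num_containers)}
    for partition in range(num_partitions):
        assignment[partition % num_containers].append(partition)
    return assignment


# 3 partitions over 5 containers: containers 3 and 4 have nothing to read.
print(assign_partitions(3, 5))  # {0: [0], 1: [1], 2: [2], 3: [], 4: []}
```

Under this model, raising the container count only helps while every container still receives at least one partition, so increasing the topic's partition count is the prerequisite for more parallelism.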
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> Cheers,
> >> >>>>> Chris
> >> >>>>>
> >> >>>>> On 8/13/14 9:37 AM, "Telles Nobrega" <[email protected]> wrote:
> >> >>>>>
> >> >>>>>> Should I use many Kafka brokers, or will one suffice?
> >> >>>>>>
> >> >>>>>> Thanks
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> On Wed, Aug 13, 2014 at 7:24 AM, Telles Nobrega <[email protected]> wrote:
> >> >>>>>>
> >> >>>>>>> It could be just one job, but what is the best way to deploy many
> >> >>>>>>> instances of this job so I could process a heavy load of messages?
> >> >>>>>>>
> >> >>>>>>> Thanks,
> >> >>>>>>>
> >> >>>>>>> On 13 Aug 2014, at 01:39, Yan Fang <[email protected]> wrote:
> >> >>>>>>>
> >> >>>>>>>> *"Does one Kafka broker handle this many messages per second?"*
> >> >>>>>>>>
> >> >>>>>>>> I believe @Chris has a better answer about this.
> >> >>>>>>>>
> >> >>>>>>>> *"I have one job that gets these messages and another that reads from
> >> >>>>>>>> the output of the first job and does some more processing."*
> >> >>>>>>>>
> >> >>>>>>>>   Why not use one job to get the messages and process them?
> >> >>>>>>>>
> >> >>>>>>>> *"when I change a configuration of one of my jobs, do I need to
> >> >>>>>>>> recompile it and send the new tar.gz to HDFS, or can I just change the
> >> >>>>>>>> deploy/samza config and it should work?"*
> >> >>>>>>>>
> >> >>>>>>>>   No, you don't need to recompile. Change the config and run-job. It
> >> >>>>>>>> will work.
> >> >>>>>>>>
> >> >>>>>>>> Thanks.
> >> >>>>>>>>
> >> >>>>>>>> Cheers,
> >> >>>>>>>>
> >> >>>>>>>> Fang, Yan
> >> >>>>>>>> [email protected]
> >> >>>>>>>> +1 (206) 849-4108
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> On Tue, Aug 12, 2014 at 8:47 PM, Telles Nobrega <[email protected]> wrote:
> >> >>>>>>>>
> >> >>>>>>>>> Not completely related to the topic of the question, but when I
> >> >>>>>>>>> change a configuration of one of my jobs, do I need to recompile it
> >> >>>>>>>>> and send the new tar.gz to HDFS, or can I just change the
> >> >>>>>>>>> deploy/samza config and it should work?
> >> >>>>>>>>>
> >> >>>>>>>>> Thanks
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> On Tue, Aug 12, 2014 at 11:23 PM, Telles Nobrega <[email protected]> wrote:
> >> >>>>>>>>>
> >> >>>>>>>>>> Hi, I'm running an experiment in which I'm supposed to run Samza
> >> >>>>>>>>>> with different input rates. First I'm running with 420
> >> >>>>>>>>>> messages/second, and I scale up to 33200 messages/second.
> >> >>>>>>>>>>
> >> >>>>>>>>>> Does one Kafka broker handle this many messages per second?
> >> >>>>>>>>>> Second, what is the best way to read this many messages into Samza?
> >> >>>>>>>>>> I have one job that gets these messages and another that reads from
> >> >>>>>>>>>> the output of the first job and does some more processing. Is the
> >> >>>>>>>>>> best way to use more containers and split the Kafka topics into
> >> >>>>>>>>>> partitions (the same number as containers), or is there a better way
> >> >>>>>>>>>> to do this?
> >> >>>>>>>>>>
> >> >>>>>>>>>> Thanks in advance,
> >> >>>>>>>>>>
> >> >>>>>>>>>> --
> >> >>>>>>>>>> ------------------------------------------
> >> >>>>>>>>>> Telles Mota Vidal Nobrega
> >> >>>>>>>>>> M.sc. Candidate at UFCG
> >> >>>>>>>>>> B.sc. in Computer Science at UFCG
> >> >>>>>>>>>> Software Engineer at OpenStack Project - HP/LSD-UFCG
> >> >>>>>>>>>>


-- 
------------------------------------------
Telles Mota Vidal Nobrega
M.sc. Candidate at UFCG
B.sc. in Computer Science at UFCG
Software Engineer at OpenStack Project - HP/LSD-UFCG
