@Andras

Hope you understood my use case above. I would appreciate it if you could
shed some more light on using a load balancer to route jobs while keeping
that load balancer outside Oozie. I would like to know whether you are
suggesting enabling this load balancer in the RM or NN, or suggesting that
I write this logic in my service. Please let me know if anything is
unclear in my use case.

Thanks.

On Mon, Dec 5, 2016 at 11:57 AM, mdk-swandha <[email protected]>
wrote:

> You mean I have to set env variables for each job/workflow execution and
> then they will be picked up by Oozie? And I should set them in my service
> (the service which finds the best cluster)?
>
> For example, let's say I have 3 clusters:
> - When a job is sent via Oozie/Hue/Zeppelin/Livy etc., it is mapped to one
> cluster and jobs always go there. Let's call this the default cluster.
> - I have a service which determines the best cluster for a given job,
> considering various attributes (availability, data locality, network
> bandwidth etc.).
> - This service exposes an API; the caller just passes the required
> parameters (job/input/output/queue etc.) and the service returns the best
> available cluster (see the sketch below).
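>
> Purely illustrative - the endpoint and field names below are made up,
> just to show the shape of the call:
>
>   POST http://cluster-chooser.example.com/best-cluster
>   { "job": "etl-hourly", "input": "/data/in", "output": "/data/out",
>     "queue": "default" }
>
>   200 OK
>   { "cluster": "cluster-2",
>     "jobTracker": "jt.cluster-2.example.com:8032",
>     "nameNode": "hdfs://nn.cluster-2.example.com:8020" }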
>
> With what I have above, I feel the calling code should live in the caller
> (Oozie/Zeppelin/any application); that keeps it simple and isolates the
> JT's default behavior. This won't disrupt existing jobs which are running
> on these clusters by introducing new settings. Maybe I'm missing how you
> are advising to create the load balancer setting in the JT and configure
> it at runtime. Can you please tell me more about how this can be done?
>
> Thanks.
> -Dipesh
>
>
>
> On Mon, Dec 5, 2016 at 10:59 AM, Andras Piros <[email protected]>
> wrote:
>
>> Hi Dipesh,
>>
>> during workflow / job submission you can define variables inside
>> job.properties, coming e.g. from env vars, that are then used in
>> workflow.xml. So much for the flexibility.
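>>
>> A minimal sketch of what I mean - the submitting script fills in the
>> values, and the hostnames below are just placeholders:
>>
>>   # job.properties, generated at submission time, e.g.:
>>   #   echo "jobTracker=$JT_ADDRESS"  > job.properties
>>   #   echo "nameNode=$NN_ADDRESS"   >> job.properties
>>   jobTracker=jt.cluster-a.example.com:8032
>>   nameNode=hdfs://nn.cluster-a.example.com:8020
>>
>> and workflow.xml just references the variables:
>>
>>   <map-reduce>
>>     <job-tracker>${jobTracker}</job-tracker>
>>     <name-node>${nameNode}</name-node>
>>     ...
>>   </map-reduce>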
>>
>> Can you tell me a use case where runtime routing to different JT / NN
>> instances via Oozie (and not e.g. via a load balancer setting configured
>> at runtime) is better?
>>
>> Thanks,
>>
>> Andras
>>
>> --
>> Andras PIROS
>> Software Engineer
>> <http://www.cloudera.com/>
>>
>> On Mon, Dec 5, 2016 at 7:45 PM, mdk-swandha <[email protected]>
>> wrote:
>>
>> > Hi Andras,
>> >
>> > The idea is to call this external service which will find the best
>> > cluster and inform the caller. So today this caller is Oozie, tomorrow
>> > it will be Zeppelin or any other application.
>> >
>> > How can I provide multiple JT and NN addresses in job.properties? You
>> > mean during job/workflow creation? Will I still need to overwrite
>> > job.properties or provide these values somewhere dynamically?
>> >
>> > Thanks.
>> > -Dipesh
>> >
>> > On Mon, Dec 5, 2016 at 5:24 AM, Andras Piros <[email protected]>
>> > wrote:
>> >
>> > > Hi Dipesh,
>> > >
>> > > seems like a bad idea to programmatically change job-tracker or
>> > > name-node properties - it's just not the task of Oozie to determine
>> > > the exact JT or NN instances Oozie should use.
>> > >
>> > > Instead, I'd rather set up a load balancer for JT and another one for
>> > > NN, and provide those addresses in Oozie's job.properties. That way
>> > > we separate concerns - the load balancer can choose the JT or NN node
>> > > at runtime, e.g. on a round-robin basis.
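>> > >
>> > > As a sketch - the VIP hostnames below are made up; the point is that
>> > > job.properties only ever sees the stable load-balancer addresses:
>> > >
>> > >   jobTracker=jt-lb.example.com:8032
>> > >   nameNode=hdfs://nn-lb.example.com:8020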
>> > >
>> > > Regards,
>> > >
>> > > Andras
>> > >
>> > > --
>> > > Andras PIROS
>> > > Software Engineer
>> > > <http://www.cloudera.com/>
>> > >
>> > > On Thu, Dec 1, 2016 at 9:29 PM, mdk-swandha <[email protected]>
>> > > wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > I have a use case like this - in a multi-cluster (Hadoop)
>> > > > environment, if I would like to send a job/Oozie workflow to a
>> > > > desired cluster at runtime, how can this be done?
>> > > >
>> > > > I see that there is a JavaActionExecutor class which reads the NN
>> > > > and JobTracker in its createBaseHadoopConf method.
>> > > >
>> > > > All Hadoop ActionExecutors are derived from JavaActionExecutor, so
>> > > > this seems to be a place where I can insert my code. How can I do
>> > > > this without disrupting the original flow when adding my hook?
>> > > >
>> > > > One option is to derive a new JavaActionExecutor, override the
>> > > > createBaseHadoopConf method, and then derive all ActionExecutors
>> > > > from my new JavaActionExecutor. It doesn't seem elegant to me, so I
>> > > > thought I'd ask here. A rough sketch of what I mean is below.
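>> > > >
>> > > > (A sketch only - the createBaseHadoopConf signature is from the
>> > > > Oozie source I'm reading and may differ by version; ClusterChooser
>> > > > and Cluster are my own hypothetical service client, not part of
>> > > > Oozie.)
>> > > >
>> > > >   import org.apache.hadoop.conf.Configuration;
>> > > >   import org.apache.oozie.action.hadoop.JavaActionExecutor;
>> > > >   import org.jdom.Element;
>> > > >
>> > > >   public class RoutingJavaActionExecutor extends JavaActionExecutor {
>> > > >       @Override
>> > > >       protected Configuration createBaseHadoopConf(Context context,
>> > > >                                                    Element actionXml) {
>> > > >           // Start from whatever Oozie would normally build
>> > > >           Configuration conf =
>> > > >                   super.createBaseHadoopConf(context, actionXml);
>> > > >           // Ask the external service for the best cluster, then
>> > > >           // overwrite the JT and NN addresses for this action
>> > > >           Cluster best = ClusterChooser.pick(context.getWorkflow());
>> > > >           conf.set("mapred.job.tracker", best.getJobTrackerUri());
>> > > >           conf.set("fs.default.name", best.getNameNodeUri());
>> > > >           return conf;
>> > > >       }
>> > > >   }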
>> > > >
>> > > > Any input will be useful.
>> > > >
>> > > > Thanks.
>> > > > -Dipesh
>> > > >
>> > >
>> >
>>
>
>
