On Fri, Jan 16, 2015 at 7:22 AM, Scott Preece <[email protected]>
wrote:

> Coming to this from the perspective of knowing the architecture of Falcon
> only in rough terms and having no experience with Quartz and very little
> with Oozie, I've got a few questions about the proposal:
> - Looking at the Quartz site it talks about time-based triggers, but not
> about triggering on availability of either files or other resources; does
> it do that? It would be good to test your list of use cases against the
> proposed solution.
>
Quartz triggers are quite extensible. It does not trigger on data
availability today, so we would need to add that. We could borrow
code from Oozie for data-availability triggers.
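
To make the idea concrete, here is a rough sketch of what a data-availability
gate could look like. Everything in it is hypothetical (the class name, the
method names, the paths); it is neither Quartz's nor Oozie's API. It checks
the local filesystem only, whereas a real version would check HDFS and be
polled from a Quartz job or custom trigger. It covers the "at least 10 hourly
instances of a day with 24" case from the original list.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical data-availability gate; not part of Quartz, Oozie, or Falcon. */
final class DataAvailabilityGate {
    private final List<Path> expectedInstances;
    private final int minRequired;

    DataAvailabilityGate(List<Path> expectedInstances, int minRequired) {
        if (minRequired < 0 || minRequired > expectedInstances.size()) {
            throw new IllegalArgumentException("minRequired out of range");
        }
        this.expectedInstances = List.copyOf(expectedInstances);
        this.minRequired = minRequired;
    }

    /** Counts how many of the expected instance paths exist right now. */
    long availableCount() {
        return expectedInstances.stream().filter(Files::exists).count();
    }

    /** True once at least minRequired of the expected instances exist. */
    boolean isSatisfied() {
        return availableCount() >= minRequired;
    }

    public static void main(String[] args) {
        // Assumed layout: one directory per hour, /data/feed/2015-01-16/00 .. 23
        List<Path> hours = new ArrayList<>();
        for (int h = 0; h < 24; h++) {
            hours.add(Path.of("/data/feed/2015-01-16", String.format("%02d", h)));
        }
        DataAvailabilityGate gate = new DataAvailabilityGate(hours, 10);
        System.out.println(gate.availableCount() + " of 24 available; fire = "
                + gate.isSatisfied());
    }
}
```

A scheduler would call isSatisfied() on each poll and launch the process
instance the first time it returns true.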


> - How would Quartz integrate with Oozie, assuming Oozie is still doing the
> workflow execution? By JMS messages?
>
You could invoke a workflow execution directly with Oozie.
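
For instance, an external scheduler could submit and start a workflow over
Oozie's web services API. The sketch below only builds the request, it does
not send it. The endpoint (POST /v1/jobs?action=start with an XML
configuration body) is as I recall it from the Oozie docs, so please verify
it against your Oozie version. The class name, host names, user, and
application path are placeholders.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; builds (but does not send) an Oozie job-start request. */
final class OozieWorkflowSubmit {

    /** Renders job properties as a Hadoop-style configuration XML document. */
    static String toConfigXml(Map<String, String> props) {
        StringBuilder sb = new StringBuilder("<configuration>");
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append("<property><name>").append(escape(e.getKey()))
              .append("</name><value>").append(escape(e.getValue()))
              .append("</value></property>");
        }
        return sb.append("</configuration>").toString();
    }

    /** Minimal XML escaping for element text. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Builds the POST that submits the workflow and starts it immediately. */
    static HttpRequest startRequest(String oozieBaseUrl, Map<String, String> props) {
        return HttpRequest.newBuilder(URI.create(oozieBaseUrl + "/v1/jobs?action=start"))
                .header("Content-Type", "application/xml;charset=UTF-8")
                .POST(HttpRequest.BodyPublishers.ofString(toConfigXml(props)))
                .build();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("user.name", "falcon");  // placeholder user
        props.put("oozie.wf.application.path",
                "hdfs://namenode.example:8020/apps/sample-wf");  // placeholder path
        HttpRequest req = startRequest("http://oozie.example:11000/oozie", props);
        System.out.println(req.method() + " " + req.uri());
    }
}
```

The Oozie Java client would be the more natural route from inside Falcon; the
plain HTTP form is shown only because it needs no extra dependency.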


> - Is triggering by external events/messages, as opposed to by its own
> scheduler, a natural mode for Oozie (are there interfaces at the right
> level)?
>
Yes, Oozie supports invoking workflows directly, without tying them to
its scheduler.


> - Is Oozie decomposable so that it would be reasonable to only include the
> execution parts and not the scheduling parts?
>
That's quite hard, from what I know.


> regards,scott
>
>      On Friday, January 16, 2015 12:37 AM, Siva Thumma <
> [email protected]> wrote:
>
>
> Whether to create a Confluence page is up to the core team. To my
> knowledge, it is too early for that.
> The scope is broad, and we are not yet clear on the complete set of use
> cases this development would solve if built from scratch, away from Oozie.
>
>
> > On 16-Jan-2015, at 7:43 am, Srikanth Sundarrajan <[email protected]>
> wrote:
> >
> > This is a very important decision for the project, and if we need to
> > discuss it more, we should, and not rush through. So I will hold off on
> > any further action in terms of pressing forward with the design.
> >
> > Here is a consolidation of the views expressed so far on this thread.
> > Folks who have responded, please chime in if I have misrepresented anyone.
> >
> > Sanjeev: Agreed with the proposal
> > Ajay: Agreed with the proposal and wanted to know how it will be
> > implemented
> > Siva Tumma: -1, as repeating some functionality in Oozie seemed wasteful
> > Venkatesh: -1 initially to this being built in Falcon, but OK with
> > leveraging capabilities through an alternate scheduler such as
> > Quartz/YARN. Subsequently expressed that chugging along with Oozie is not
> > ideal in the long run
> > Shwetha: OK with replacing Oozie altogether, including workflow
> > execution. She felt that some of these capabilities may exist in Oozie
> > and is yet to get back on whether they really do.
> > JB: Initially had reservations about repeating functionality in Falcon,
> > later +1
> > Shaik: Agreed to the proposal, additionally calling out more
> > capabilities than were called out in the initial thread.
> > Srikanth: I would like to provide a lot more capabilities to users than
> > what is supported and would really like this to happen, so +1
> >
> > Regards
> > Srikanth Sundarrajan
> >
> >> Date: Thu, 15 Jan 2015 11:27:17 -0800
> >> Subject: Re: [DISCUSS] Orchestration in Falcon
> >> From: [email protected]
> >> To: [email protected]
> >>
> >> On Thu, Jan 15, 2015 at 1:25 AM, Srikanth Sundarrajan <
> [email protected]>
> >> wrote:
> >>
> >>> [email protected]
> >>>
> >>> It looks like we have broad consensus on this,
> >>
> >> Really? That's not how I read this. I'm still not sure it's worth taking
> >> on this complexity in Falcon. Did we even explore other options? I'm not
> >> sure.
> >>
> >>
> >>> should we open up a discuss thread on how we go about this ?
> >>
> >> Maybe.
> >>
> >>
> >>> Or should we create a confluence page and collaborate through that ?
> >> Too early for this.
> >>
> >>
> >>>
> >>> Regards
> >>> Srikanth Sundarrajan
> >>>
> >>>> From: [email protected]
> >>>> Date: Thu, 1 Jan 2015 22:40:48 +0530
> >>>> Subject: Re: [DISCUSS] Orchestration in Falcon
> >>>> To: [email protected]
> >>>>
> >>>> +1.
> >>>>
> >>>> Few more relevant asks:
> >>>> 1. Support for a "Last Only" option for process scheduling (in addition
> >>>> to LIFO/FIFO); currently Oozie has some issues here.
> >>>> 2. Support for a singleton process (lock based); the behaviour of all
> >>>> instances of the process is the same.
> >>>>
> >>>> Thanks,
> >>>> -Idris
> >>>>
> >>>>
> >>>> On Thu, Jan 1, 2015 at 7:51 PM, Jean-Baptiste Onofré <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> +1
> >>>>>
> >>>>> Regards
> >>>>> JB
> >>>>>
> >>>>>
> >>>>>> On 12/31/2014 03:53 PM, Srikanth Sundarrajan wrote:
> >>>>>>
> >>>>>> Can we pick up this thread in the new year when folks are back from
> >>>>>> break? I am in total agreement with Venkatesh here. We ought to have
> >>>>>> a long-term sustainable approach. Also, I feel that getting the
> >>>>>> capabilities we would like to enable on Falcon done through Oozie in
> >>>>>> the near term seems to be a tall ask anyway.
> >>>>>>
> >>>>>> Regards
> >>>>>> Srikanth Sundarrajan
> >>>>>>
> >>>>>> Date: Tue, 23 Dec 2014 16:44:06 -0800
> >>>>>>> Subject: Re: [DISCUSS] Orchestration in Falcon
> >>>>>>> From: [email protected]
> >>>>>>> To: [email protected]
> >>>>>>>
> >>>>>>> Chugging along with Oozie is bad for Falcon in the long run, for
> >>>>>>> users and developers. It's horribly complex to work through the many
> >>>>>>> architectural rough edges in Oozie. Look at all the patches for
> >>>>>>> security that I had to fix around Oozie. It's unnecessarily complex,
> >>>>>>> non-uniform, and is NOT meant to be used by another tool like Falcon
> >>>>>>> but was built around the end user.
> >>>>>>>
> >>>>>>> This is a good discussion to have - maybe explore Oozie for the
> >>>>>>> short term but look at alternative solutions for the long term.
> >>>>>>>
> >>>>>>> On Tue, Dec 23, 2014 at 7:28 AM, Srikanth Sundarrajan <
> >>>>>>> [email protected]>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> @jb, There is no doubt merit in mapping them to Oozie if possible
> >>>>>>>> and if extensions are simple and straightforward enough.
> >>>>>>>>
> >>>>>>>> Also had a quick chat offline with Shwetha and she mentioned some
> >>>>>>>> work happening in Oozie in this regard. On further digging, found
> >>>>>>>> https://issues.apache.org/jira/browse/OOZIE-1976. This is possibly
> >>>>>>>> what Shwetha was referring to. From the looks of it, this tries to
> >>>>>>>> address item #7 in the original thread. Maybe there are more JIRAs
> >>>>>>>> where additional work, such as a-periodic datasets, is being done.
> >>>>>>>> Perhaps @Shwetha can throw some light on what is being considered
> >>>>>>>> and/or how these gating/orchestration use cases can be managed.
> >>>>>>>>
> >>>>>>>> Regards
> >>>>>>>> Srikanth Sundarrajan
> >>>>>>>>
> >>>>>>>> Date: Tue, 23 Dec 2014 11:06:24 +0100
> >>>>>>>>> From: [email protected]
> >>>>>>>>> To: [email protected]
> >>>>>>>>> Subject: Re: [DISCUSS] Orchestration in Falcon
> >>>>>>>>>
> >>>>>>>>> Hi all,
> >>>>>>>>>
> >>>>>>>>> I second Shwetha there. I think we can achieve such features in
> >>> Oozie
> >>>>>>>>> (with some adaptations).
> >>>>>>>>>
> >>>>>>>>> Regards
> >>>>>>>>> JB
> >>>>>>>>>
> >>>>>>>>> Le 2014-12-23 10:53, Shwetha G S a écrit :
> >>>>>>>>>
> >>>>>>>>>> If we can get rid of Oozie entirely, yes, we can explore other
> >>>>>>>>>> possibilities. But if we are still going to use Oozie for DAG
> >>>>>>>>>> execution, we are going to add another bottleneck in the whole
> >>>>>>>>>> execution (currently, Falcon is not in the workflow execution
> >>>>>>>>>> path), and I don't think it's worth it.
> >>>>>>>>>>
> >>>>>>>>>> The features outlined above are all available in basic forms in
> >>>>>>>>>> Oozie, and it should be easy to enhance them or make them
> >>>>>>>>>> extension points.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> -Shwetha
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Dec 23, 2014 at 8:12 AM, Srikanth Sundarrajan
> >>>>>>>>>> <[email protected]>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Here are a few more gaps that we ought to solve for while we are
> >>>>>>>>>>> on the subject:
> >>>>>>>>>>>
> >>>>>>>>>>> 1. Ability to attach to start & finish events of workflow
> >>>>>>>>>>> execution. Currently we have a post-processing hook to listen to
> >>>>>>>>>>> finish events, but we do run into scenarios where there are
> >>>>>>>>>>> occasional failures with post-processing and there is a potential
> >>>>>>>>>>> phase lag in learning about the events.
> >>>>>>>>>>> 2. Strict enforcement of concurrency control, possibly spanning
> >>>>>>>>>>> process boundaries.
> >>>>>>>>>>> 3. Ability to tune how backlogs are caught up (older instances
> >>>>>>>>>>> given higher priority, newer instances given higher priority, or
> >>>>>>>>>>> some sort of weights to allow both to make progress at varying
> >>>>>>>>>>> rates). Users have also asked for routing current vs. older
> >>>>>>>>>>> instances to different queues as an alternative.
> >>>>>>>>>>> 4. Ability to have a notion of non-time-based feed instances and
> >>>>>>>>>>> related coordination.
> >>>>>>>>>>> 5. Currently keeping track of and managing SLAs is also a
> >>>>>>>>>>> challenge, but with #1 addressed, this might be a lesser concern.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards
> >>>>>>>>>>> Srikanth Sundarrajan
> >>>>>>>>>>>
> >>>>>>>>>>> Subject: Re: [DISCUSS] Orchestration in Falcon
> >>>>>>>>>>>> From: [email protected]
> >>>>>>>>>>>> Date: Tue, 23 Dec 2014 06:30:30 +0530
> >>>>>>>>>>>> To: [email protected]
> >>>>>>>>>>>>
> >>>>>>>>>>>> @venkatesh, the question really is how do we enable these gating
> >>>>>>>>>>>> preconditions. It seems hard enough to add them to Oozie, but I
> >>>>>>>>>>>> am not familiar enough with Oozie to comment on how hard or easy
> >>>>>>>>>>>> it is. Like I responded to @ajay on the same thread, if we are to
> >>>>>>>>>>>> do away with coordination through Oozie, we can follow up this
> >>>>>>>>>>>> discussion with approaches and design. Though I had Quartz in
> >>>>>>>>>>>> mind, I wanted to leave that out of the discussion to see if
> >>>>>>>>>>>> there is consensus for moving away from Oozie coords and
> >>>>>>>>>>>> implementing them through other means.
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Sent from my iPhone
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 23-Dec-2014, at 1:16 am, "Seetharam Venkatesh" <
> >>>>>>>>>>>> [email protected]> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> What is the purpose of this decoupling? Why build this into
> >>>>>>>>>>>>> Falcon? Scheduling is so common that schedulers are a dime a
> >>>>>>>>>>>>> dozen today, and they are all extensible with custom triggers.
> >>>>>>>>>>>>> Making it part of Falcon will suffer the same issues that Oozie
> >>>>>>>>>>>>> has today.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I'm sorry but I'm a HUGE -1 to this being built into the Falcon
> >>>>>>>>>>>>> codebase.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> However, I'm +1 to reusing the Quartz scheduler that already
> >>>>>>>>>>>>> exists - stand it up outside or embed it like we do for
> >>>>>>>>>>>>> ActiveMQ.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Phase 2 - I'd like to see us write a simple DAG execution layer
> >>>>>>>>>>>>> in YARN as an app master without a DB that keeps state on HDFS,
> >>>>>>>>>>>>> as an alternative to Oozie.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Then we will have a nimble Falcon which can kick ass.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Sun, Dec 21, 2014 at 6:13 AM, Srikanth Sundarrajan <
> >>>>>>>>>>>>> [email protected]> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hello Team,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Since its inception Falcon has used Oozie for process
> >>>>>>>>>>>>>> orchestration as well as feed life cycle phase executions.
> >>>>>>>>>>>>>> While this has worked reasonably and allowed us to make
> >>>>>>>>>>>>>> higher-level capabilities available through Falcon, we are
> >>>>>>>>>>>>>> increasingly seeing scenarios where this is proving to be a
> >>>>>>>>>>>>>> limiting factor. In its current form, Falcon relies on Oozie
> >>>>>>>>>>>>>> for both scheduling and workflow execution, due to which the
> >>>>>>>>>>>>>> scheduling is limited to time-based/cron-based scheduling with
> >>>>>>>>>>>>>> additional gating conditions on data availability. This also
> >>>>>>>>>>>>>> imposes the restriction that datasets be periodic/cyclic in
> >>>>>>>>>>>>>> nature.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> From an orchestration standpoint, it would help if we could
> >>>>>>>>>>>>>> support standard gating/scheduling primitives via Falcon:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 1. Simple periodic scheduling with no gating conditions
> >>>>>>>>>>>>>> 2. Cron-based scheduling (day of week, day of the month,
> >>>>>>>>>>>>>> specific hours, and non-periodic) with no gating conditions
> >>>>>>>>>>>>>> 3. Availability of new data (assuming monotonically increasing
> >>>>>>>>>>>>>> data versions, availability of new versions)
> >>>>>>>>>>>>>> 4. Changes to existing data (reinstatement - similar to late
> >>>>>>>>>>>>>> data handling)
> >>>>>>>>>>>>>> 5. External triggers/notifications
> >>>>>>>>>>>>>> 6. Availability of specific instances of data declared as a
> >>>>>>>>>>>>>> mandatory dependency
> >>>>>>>>>>>>>> 7. Availability of a minimum subset of instances of data
> >>>>>>>>>>>>>> declared as a mandatory dependency (for example, at least 10
> >>>>>>>>>>>>>> hourly instances of a day with 24 instances)
> >>>>>>>>>>>>>> 8. Valid combinations of the above.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> In this context, I would like to propose that we move away
> >>>>>>>>>>>>>> from Oozie for the orchestration requirements and have them
> >>>>>>>>>>>>>> implemented natively within Falcon. It will no doubt make the
> >>>>>>>>>>>>>> Falcon server bulkier and heavier in both code and deployment,
> >>>>>>>>>>>>>> but it seems that without it, orchestration within Falcon will
> >>>>>>>>>>>>>> be limited by the capabilities available within Oozie.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Please do note that this suggestion is restricted to the
> >>>>>>>>>>>>>> scheduling and not to the workflow execution.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Would like to hear from fellow developers and users on what
> >>>>>>>>>>>>>> your thoughts are. Please do chime in with your views.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards
> >>>>>>>>>>>>>> Srikanth Sundarrajan
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> --
> >>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>> Venkatesh
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> “Perfection (in design) is achieved not when there is nothing
> >>>>>>>>>>>>> more to add, but rather when there is nothing more to take
> >>>>>>>>>>>>> away.”
> >>>>>>>>>>>>> - Antoine de Saint-Exupéry
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Regards,
> >>>>>>> Venkatesh
> >>>>>>>
> >>>>>>> “Perfection (in design) is achieved not when there is nothing more
> >>>>>>> to add, but rather when there is nothing more to take away.”
> >>>>>>> - Antoine de Saint-Exupéry
> >>>>> --
> >>>>> Jean-Baptiste Onofré
> >>>>> [email protected]
> >>>>> http://blog.nanthrax.net
> >>>>> Talend - http://www.talend.com
> >>
> >>
> >>
> >> --
> >> Regards,
> >> Venkatesh
> >>
> >> “Perfection (in design) is achieved not when there is nothing more to
> >> add, but rather when there is nothing more to take away.”
> >> - Antoine de Saint-Exupéry
> >
>
>
>



-- 
Regards,
Venkatesh

“Perfection (in design) is achieved not when there is nothing more to add,
but rather when there is nothing more to take away.”
- Antoine de Saint-Exupéry
