Whether to go for a Confluence page is up to the core team. To my knowledge,
it is too early.
The scope is broad, and we are not yet clear on the complete set of use cases
this development would solve if built from scratch, away from Oozie.


> On 16-Jan-2015, at 7:43 am, Srikanth Sundarrajan <[email protected]> wrote:
> 
> This is a very important decision for the project and if we need to discuss 
> this more, we should, and not rush through. So I will hold off on any further 
> action in terms of pressing forward with the design.
> 
> Here is a consolidation of the views expressed so far on this thread. Folks 
> who have responded, please chime in if I have misrepresented anyone.
> 
> Sanjeev: Agreed with the proposal
> Ajay: Agreed with the proposal and wanted to know how it will be implemented
> Siva Tumma: -1, as repeating some functionality in Oozie seemed wasteful
> Venkatesh: -1 initially to this being built in Falcon, but ok with leveraging 
> capabilities through alternate scheduler such as Quartz/Yarn. Subsequently 
> expressed how chugging along with Oozie is not ideal in the long run
> Shwetha: Ok with replacing Oozie altogether including workflow execution. She 
> felt that some of these may exist in Oozie and is yet to revert on whether 
> they really do.
> JB: Initially had reservations to repeating functionality in Falcon, later +1
> Shaik: Agreed to the proposal, additionally calling out more capabilities 
> than were originally called out in the initial thread.
> Srikanth: I would like to provide a lot more capabilities to users than what 
> is supported and would really like this to happen, so +1
> 
> Regards
> Srikanth Sundarrajan
> 
>> Date: Thu, 15 Jan 2015 11:27:17 -0800
>> Subject: Re: [DISCUSS] Orchestration in Falcon
>> From: [email protected]
>> To: [email protected]
>> 
>> On Thu, Jan 15, 2015 at 1:25 AM, Srikanth Sundarrajan <[email protected]>
>> wrote:
>> 
>>> [email protected]
>>> 
>>> It looks like we have broad consensus on this,
>> 
>> Really? That's not how I read this. I'm still not sure it's worth taking on
>> this complexity into Falcon. Did we even explore other options? I'm not
>> sure.
>> 
>> 
>>> should we open up a discuss thread on how we go about this ?
>> 
>> Maybe.
>> 
>> 
>>> Or should we create a confluence page and collaborate through that ?
>> Too early for this.
>> 
>> 
>>> 
>>> Regards
>>> Srikanth Sundarrajan
>>> 
>>>> From: [email protected]
>>>> Date: Thu, 1 Jan 2015 22:40:48 +0530
>>>> Subject: Re: [DISCUSS] Orchestration in Falcon
>>>> To: [email protected]
>>>> 
>>>> +1.
>>>> 
>>>> Few more relevant asks:
>>>> 1. Support for "Last Only" option for process scheduling (In addition to
>>>> LIFO/FIFO), currently oozie has some issues.
>>>> 2. Support for Singleton process (lock based), the behaviour of all
>>>> instances of process is same.
>>>> 
>>>> Thanks,
>>>> -Idris
>>>> 
>>>> 
>>>> On Thu, Jan 1, 2015 at 7:51 PM, Jean-Baptiste Onofré <[email protected]>
>>>> wrote:
>>>> 
>>>>> +1
>>>>> 
>>>>> Regards
>>>>> JB
>>>>> 
>>>>> 
>>>>>> On 12/31/2014 03:53 PM, Srikanth Sundarrajan wrote:
>>>>>> 
>>>>>> Can we pick up this thread in the new year when folks are back from
>>>>>> break? I am in total agreement with Venkatesh here. We ought to have a
>>>>>> long term sustainable approach. Also I feel that the capabilities that we
>>>>>> would like to enable on falcon and getting them done through oozie in near
>>>>>> term seems to be a tall ask anyways.
>>>>>> 
>>>>>> Regards
>>>>>> Srikanth Sundarrajan
>>>>>> 
>>>>>> Date: Tue, 23 Dec 2014 16:44:06 -0800
>>>>>>> Subject: Re: [DISCUSS] Orchestration in Falcon
>>>>>>> From: [email protected]
>>>>>>> To: [email protected]
>>>>>>> 
>>>>>>> Chugging along with Oozie is bad for Falcon in the long run, for users
>>>>>>> and developers. It's horribly complex to work through the many rough
>>>>>>> edges architecturally in Oozie. Look at all the patches for security that
>>>>>>> I had to fix around Oozie. It's unnecessarily very complex, non-uniform
>>>>>>> and is NOT meant to be used by another tool like Falcon but was built
>>>>>>> around the end user.
>>>>>>> 
>>>>>>> This is a good discussion to have - maybe explore oozie for short-term
>>>>>>> but look at alternative solutions for the long-term.
>>>>>>> 
>>>>>>> On Tue, Dec 23, 2014 at 7:28 AM, Srikanth Sundarrajan <
>>>>>>> [email protected]>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> @jb, There is no doubt merit in mapping them to oozie if possible and if
>>>>>>>> extensions are simple and straightforward enough.
>>>>>>>> 
>>>>>>>> Also had a quick chat offline with Shwetha and she mentioned some
>>>>>>>> work happening in Oozie in this regard. On further digging, found
>>>>>>>> https://issues.apache.org/jira/browse/OOZIE-1976. This is possibly what
>>>>>>>> Shwetha was referring to. From the looks of it, this tries to address
>>>>>>>> item #7 in the original thread. Maybe there are more jiras where
>>>>>>>> additional work such as a-periodic datasets is being worked on. Perhaps
>>>>>>>> @Shwetha can throw some light on what is being considered and/or how
>>>>>>>> these gating/orchestration use cases can be managed.
>>>>>>>> 
>>>>>>>> Regards
>>>>>>>> Srikanth Sundarrajan
>>>>>>>> 
>>>>>>>> Date: Tue, 23 Dec 2014 11:06:24 +0100
>>>>>>>>> From: [email protected]
>>>>>>>>> To: [email protected]
>>>>>>>>> Subject: Re: [DISCUSS] Orchestration in Falcon
>>>>>>>>> 
>>>>>>>>> Hi all,
>>>>>>>>> 
>>>>>>>>> I second Shwetha there. I think we can achieve such features in Oozie
>>>>>>>>> (with some adaptations).
>>>>>>>>> 
>>>>>>>>> Regards
>>>>>>>>> JB
>>>>>>>>> 
>>>>>>>>> On 2014-12-23 10:53, Shwetha G S wrote:
>>>>>>>>> 
>>>>>>>>>> If we can get rid of oozie entirely, yes we can explore other
>>>>>>>>>> possibilities. But if we are still going to use oozie for DAG
>>>>>>>>>> execution, we are going to add another bottleneck in the whole
>>>>>>>>>> execution (currently, falcon is not in the workflow execution path)
>>>>>>>>>> and I don't think it's worth it.
>>>>>>>>>> 
>>>>>>>>>> The features that are outlined above are all available in basic forms
>>>>>>>>>> in oozie and it should be easy to enhance them/make them as extension
>>>>>>>>>> points.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> -Shwetha
>>>>>>>>>> 
>>>>>>>>>> On Tue, Dec 23, 2014 at 8:12 AM, Srikanth Sundarrajan
>>>>>>>>>> <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Here are a few more gaps that we ought to solve for while we are on
>>>>>>>>>>> the subject:
>>>>>>>>>>> 
>>>>>>>>>>> 1. Ability to attach to start & finish events of workflow execution.
>>>>>>>>>>> Currently we have post processing hook to listen to finish events,
>>>>>>>>>>> but we do run into scenarios where there are occasional failures with
>>>>>>>>>>> post-processing and there is potential phase lag in learning about
>>>>>>>>>>> the events.
>>>>>>>>>>> 2. Strict enforcement of concurrency control possibly spanning
>>>>>>>>>>> process boundaries.
>>>>>>>>>>> 3. Ability to tune how backlogs have to be caught up (old instances
>>>>>>>>>>> to be given higher priority, newer instances to be given higher
>>>>>>>>>>> priority, or some sort of weights to allow both to make progress at
>>>>>>>>>>> varying rates). There have been asks for routing current vs older
>>>>>>>>>>> instances to different queues by users as an alternative.
>>>>>>>>>>> 4. Ability to have a notion of non-time based feed instances and
>>>>>>>>>>> related coordination.
>>>>>>>>>>> 5. Currently keeping track of and managing SLAs is also a challenge,
>>>>>>>>>>> but with #1 addressed, this might be a lesser concern.
>>>>>>>>>>> 
>>>>>>>>>>> Regards
>>>>>>>>>>> Srikanth Sundarrajan
>>>>>>>>>>> 
>>>>>>>>>>> Subject: Re: [DISCUSS] Orchestration in Falcon
>>>>>>>>>>>> From: [email protected]
>>>>>>>>>>>> Date: Tue, 23 Dec 2014 06:30:30 +0530
>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>> 
>>>>>>>>>>>> @venkatesh, the question really is how do we enable these gating
>>>>>>>>>>>> pre conditions. Seems hard enough to add them to oozie, but am not
>>>>>>>>>>>> intimately familiar with oozie to comment on how hard or easy it
>>>>>>>>>>>> is. Like I responded to @ajay on the same thread, if we are to do
>>>>>>>>>>>> away with coordination through oozie, we can follow up this
>>>>>>>>>>>> discussion with approaches and design. Though I had quartz in my
>>>>>>>>>>>> mind, wanted to leave that out of discussion to see if there is
>>>>>>>>>>>> consensus for moving away from oozie coords and implementing them
>>>>>>>>>>>> through other means.
>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>>> 
>>>>>>>>>>>> On 23-Dec-2014, at 1:16 am, "Seetharam Venkatesh" <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> What is the purpose of this decoupling? Why build this into
>>>>>>>>>>>>> Falcon? Scheduling is so common that there are dime a dozen
>>>>>>>>>>>>> schedulers today and they are all extensible with custom
>>>>>>>>>>>>> triggers. Making it part of Falcon will suffer the same issues
>>>>>>>>>>>>> that Oozie has today.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I'm sorry but I'm a HUGE -1 to this being built into Falcon
>>>>>>>>>>>>> codebase. However, I'm +1 to reusing Quartz scheduler that
>>>>>>>>>>>>> already exists - stand it up outside or embed it like we do for
>>>>>>>>>>>>> active MQ.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Phase 2 - I'd like to see we write a simple DAG execution layer
>>>>>>>>>>>>> in YARN as an app master with out DB and keeps state on HDFS as
>>>>>>>>>>>>> an alternate to Oozie.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Then we will have a nimble falcon which can kick ass.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Sun, Dec 21, 2014 at 6:13 AM, Srikanth Sundarrajan <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hello Team,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Since its inception Falcon has used Oozie for process
>>>>>>>>>>>>>> orchestration as well as feed life cycle phase executions, while
>>>>>>>>>>>>>> this has worked reasonably and allowed to make higher level
>>>>>>>>>>>>>> capabilities available through Falcon, we are increasingly seeing
>>>>>>>>>>>>>> scenarios where this is proving to be a limiting factor. In its
>>>>>>>>>>>>>> current form, Falcon relies on Oozie for both scheduling and for
>>>>>>>>>>>>>> workflow execution, due to which the scheduling is limited to
>>>>>>>>>>>>>> time based/cron based scheduling with additional gating
>>>>>>>>>>>>>> conditions on data availability. Also this imposes restrictions
>>>>>>>>>>>>>> on datasets being periodic/cyclic in nature.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> From an orchestration stand point, it would help if we can
>>>>>>>>>>>>>> support standard gating / scheduling primitives via Falcon:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 1. Simple periodic scheduling with no gating conditions
>>>>>>>>>>>>>> 2. Cron based scheduling (day of week, day of the month, specific
>>>>>>>>>>>>>> hours and non-periodic) with no gating conditions
>>>>>>>>>>>>>> 3. Availability of new data (assuming monotonically increasing
>>>>>>>>>>>>>> data version, availability of new versions)
>>>>>>>>>>>>>> 4. Changes to existing data (reinstatement - similar to late data
>>>>>>>>>>>>>> handling)
>>>>>>>>>>>>>> 5. External trigger/notifications
>>>>>>>>>>>>>> 6. Availability of specific instances of data as declared as
>>>>>>>>>>>>>> mandatory dependency
>>>>>>>>>>>>>> 7. Availability of a minimum subset of instances of data declared
>>>>>>>>>>>>>> as mandatory dependency (at least 10 hourly instances of a day
>>>>>>>>>>>>>> with 24 instances for ex)
>>>>>>>>>>>>>> 8. Valid combinations of the above.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> In this context, I would like to propose that we move away from
>>>>>>>>>>>>>> Oozie for the orchestration requirements and have them
>>>>>>>>>>>>>> implemented natively within Falcon. It will no doubt make Falcon
>>>>>>>>>>>>>> server bulkier and heavier in both code and deployment, but seems
>>>>>>>>>>>>>> like without it, the orchestration within Falcon will be limited
>>>>>>>>>>>>>> by capabilities available within Oozie.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Please do note that this suggestion is restricted to the
>>>>>>>>>>>>>> scheduling and not to the workflow execution.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Would like to hear from fellow developers and users on what your
>>>>>>>>>>>>>> thoughts are. Please do chime in with your views.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>> Srikanth Sundarrajan
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Venkatesh
>>>>>>>>>>>>> 
>>>>>>>>>>>>> “Perfection (in design) is achieved not when there is nothing
>>>>>>>>>>>>> more to add, but rather when there is nothing more to take away.”
>>>>>>>>>>>>> - Antoine de Saint-Exupéry
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Regards,
>>>>>>> Venkatesh
>>>>>>> 
>>>>>>> “Perfection (in design) is achieved not when there is nothing more to
>>>>>>> add,
>>>>>>> but rather when there is nothing more to take away.”
>>>>>>> - Antoine de Saint-Exupéry
>>>>> --
>>>>> Jean-Baptiste Onofré
>>>>> [email protected]
>>>>> http://blog.nanthrax.net
>>>>> Talend - http://www.talend.com
>> 
>> 
>> 
>> -- 
>> Regards,
>> Venkatesh
>> 
>> “Perfection (in design) is achieved not when there is nothing more to add,
>> but rather when there is nothing more to take away.”
>> - Antoine de Saint-Exupéry
>                         
