Hi Upeksha, I will change my implementation [1] accordingly.
[1] https://github.com/apache/airavata/compare/develop...yasgun:ochestrator-refactoring

Thank you!

On Sat, Jul 7, 2018 at 10:09 PM DImuthu Upeksha <dimuthu.upeks...@gmail.com> wrote:

> Hi Yasas,
>
> My preference is to go with the first approach. It looks simple and clean. Having two notations, Workflow and Experiment, might also confuse the users. I agree that the APIs would have to change, but we can preserve the old APIs for some time by manually mapping them to the new ones. Can you share the fork you are currently working on with this thread as well?
>
> Thanks
> Dimuthu
>
> On Sat, Jul 7, 2018 at 12:50 AM, Yasas Gunarathne <yasasgunarat...@gmail.com> wrote:
>
>> Hi All,
>>
>> Thank you for the information. I will consider the explained scenarios while modifying the Orchestrator with workflow capabilities.
>>
>> Apart from that, I have a few issues to clarify regarding the API level implementation. *ExperimentModel* has an *ExperimentType* enum which includes two basic types: *SINGLE_APPLICATION* and *WORKFLOW*. According to this, an experiment can be a single application or a workflow (which may include multiple applications). But the other parameters in the experiment model are defined considering it only as a single application. Therefore, the actual meaning of an experiment needs to be clarified in order to continue with the API level implementation.
>>
>> There are two basic options:
>> 1. Modifying *ExperimentModel* to support workflows (which requires all client side implementations to be modified)
>> [image: 1.png]
>> 2. Defining a separate *WorkflowModel* for workflow execution and removing the *ExperimentType* parameter from *ExperimentModel* to avoid confusion.
>> [image: 2.png]
>> Please provide any suggestions regarding these two options, or any other alternative if there is one. For the moment, I am working on creating a separate *WorkflowModel* (which is somewhat similar to the XBaya *WorkflowModel*).
>>
>> Regards
>>
>> On Mon, Jun 4, 2018 at 8:41 PM Pamidighantam, Sudhakar <pamid...@iu.edu> wrote:
>>
>>> Sometimes the workflow crashes and/or ends unfinished, which is probably more like scenario 2. In those cases one also has to restart the workflow from the point where it stopped. So a workflow state needs to be maintained, along with the data needed and where it might be available when a restart is required. It is not strictly cloning and rerunning an old workflow, but restarting in the middle of an execution.
>>>
>>> Thanks,
>>> Sudhakar.
>>>
>>> On Jun 4, 2018, at 10:43 AM, DImuthu Upeksha <dimuthu.upeks...@gmail.com> wrote:
>>>
>>> Hi Yasas,
>>>
>>> Thanks for the summary. Now that you have a clear idea about what you have to do, let's move on to implementing a prototype that validates your workflow blocks so that we can give our feedback constructively.
>>>
>>> Hi Sudhakar,
>>>
>>> Based on your question, I can imagine two scenarios.
>>>
>>> 1. The workflow is paused in the middle and resumed when required. This is straightforward if we use the Helix API directly.
>>>
>>> 2. The workflow is stopped permanently and given a fresh restart. As far as I understand, Helix currently does not have a workflow cloning capability, so we might have to clone it on our side and instruct Helix to run it as a new workflow. Or we can extend the Helix API to support workflow cloning, which is the cleaner and ideal way; however, that would need some understanding of the Helix code base and proper testing. So for the time being, let's go with the first approach.
>>>
>>> Thanks
>>> Dimuthu
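To make scenario 1 concrete, here is a minimal, hypothetical sketch of pausing and resuming a running workflow through the Helix task framework. It assumes a Helix 0.8.x-style TaskDriver API; the cluster name, ZooKeeper address, instance name, and workflow id are illustrative placeholders, not values from the Airavata code base.

```java
import org.apache.helix.HelixManager;
import org.apache.helix.HelixManagerFactory;
import org.apache.helix.InstanceType;
import org.apache.helix.task.TaskDriver;

public class WorkflowFlowControlSketch {

    public static void main(String[] args) throws Exception {
        // Connect to the Helix cluster that runs the workflows
        // (cluster name, instance name, and ZK address are placeholders).
        HelixManager manager = HelixManagerFactory.getZKHelixManager(
                "AiravataCluster", "workflow-manager",
                InstanceType.SPECTATOR, "localhost:2181");
        manager.connect();

        TaskDriver driver = new TaskDriver(manager);
        String workflowName = "airavata-workflow-42"; // hypothetical workflow id

        // Scenario 1: pause the whole workflow and resume it later.
        driver.stop(workflowName);    // running tasks finish; no new tasks are scheduled
        driver.resume(workflowName);  // scheduling continues from where it stopped

        // Scenario 2 (fresh restart) would instead require building and
        // submitting a new Workflow definition under a new name, since
        // Helix itself does not clone workflows.

        manager.disconnect();
    }
}
```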
>>> On Sun, Jun 3, 2018 at 7:35 AM, Pamidighantam, Sudhakar <pamid...@iu.edu> wrote:
>>>
>>>> Is there a chance to include a workflow restarter (from where it was stopped earlier) in the tasks?
>>>>
>>>> Thanks,
>>>> Sudhakar.
>>>>
>>>> On Jun 2, 2018, at 11:52 PM, Yasas Gunarathne <yasasgunarat...@gmail.com> wrote:
>>>>
>>>> Hi Suresh and Dimuthu,
>>>>
>>>> Thank you very much for the clarifications and suggestions. Based on them, and on other Helix related factors encountered during the implementation process, I have updated and simplified the structure of the workflow execution framework.
>>>>
>>>> *1. Airavata Workflow Manager*
>>>>
>>>> The Airavata Workflow Manager is responsible for accepting the workflow information provided by the user, creating a Helix workflow with task dependencies, and submitting it for execution.
>>>>
>>>> *2. Airavata Workflow Data Blocks*
>>>>
>>>> Airavata Workflow Data Blocks are saved in JSON format as user content in the Helix workflow scope. These blocks contain the links to the user's input data, replica catalog entries for output data, and other information required for the workflow execution.
>>>>
>>>> *3. Airavata Workflow Tasks*
>>>>
>>>> *3.1. Operator Tasks*
>>>>
>>>> *i. Flow Starter Task*
>>>> The Flow Starter Task is responsible for starting a specific branch of the Airavata Workflow. A single Airavata Workflow can have multiple starting points.
>>>>
>>>> *ii. Flow Terminator Task*
>>>> The Flow Terminator Task is responsible for terminating a specific branch of the Airavata Workflow. A single workflow can have multiple terminating points.
>>>>
>>>> *iii. Flow Barrier Task*
>>>> The Flow Barrier Task works as a waiting component in the middle of a workflow. For example, if two experiments are running and the results of both are required to continue the workflow, the barrier waits for both experiments to complete before continuing.
>>>>
>>>> *iv. Flow Divider Task*
>>>> The Flow Divider Task opens up new branches of the workflow.
>>>>
>>>> *v. Condition Handler Task*
>>>> The Condition Handler Task is the path selection component of the workflow.
>>>>
>>>> *3.2. Processor Tasks*
>>>>
>>>> These components are responsible for triggering the Orchestrator to perform specific processes (e.g. experiments or data processing activities).
>>>>
>>>> *3.3. Loop Tasks*
>>>>
>>>> *i. Foreach Loop Task*
>>>> *ii. Do While Loop Task*
>>>>
>>>> Regards
>>>>
>>>> On Mon, May 21, 2018 at 4:01 PM Suresh Marru <sma...@apache.org> wrote:
>>>>
>>>>> Hi Yasas,
>>>>>
>>>>> This is good detail. I haven't digested it all, but here is some quick feedback. Instead of connecting multiple experiments within a workflow, which will be confusing from a user point of view, can you use the following terminology:
>>>>>
>>>>> * A computational experiment may have a single application execution or multiple (a workflow).
>>>>>
>>>>> ** So an experiment may correspond to a single application execution, multiple application executions, or even multiple workflows nested amongst them (hierarchical workflows). To avoid any confusion, let's call these units of execution a Process.
>>>>>
>>>>> A Process is an abstract notion for a unit of execution, without going into implementation details, and it describes the inputs and outputs. For an experiment with a single application, experiment and process have a one-to-one correspondence, but within a workflow, each step is a Process.
>>>>>
>>>>> Tasks are the implementation detail of a Process.
>>>>>
>>>>> So the change in your architecture will be to chain multiple processes together within an experiment, and not to chain multiple experiments. Does it make sense? You can also refer to the attached figure, which illustrates these from a data model perspective.
>>>>>
>>>>> Suresh
>>>>>
>>>>> P.S. Overall, great going on mailing list communications, keep 'em coming.
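As a plain illustration of this terminology (not the actual Airavata data models; the class and field names below are made up for the example), an experiment that chains processes, each realized by tasks, could be sketched like this:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative-only sketch of the suggested terminology: an Experiment is a
// chain of Processes (abstract units of execution with inputs and outputs),
// and Tasks are the implementation detail of each Process.
public class ExperimentTerminologySketch {

    static class Task {
        final String type; // e.g. "ENV_SETUP", "DATA_STAGING", "JOB_SUBMISSION"
        Task(String type) { this.type = type; }
    }

    static class Process {
        final String applicationId;                 // e.g. "Gaussian", "Lammps"
        final List<String> inputs = new ArrayList<>();
        final List<String> outputs = new ArrayList<>();
        final List<Task> tasks = new ArrayList<>(); // implementation detail
        Process(String applicationId) { this.applicationId = applicationId; }
    }

    static class Experiment {
        final String experimentId;
        // One process for a single-application experiment,
        // several chained processes for a workflow experiment.
        final List<Process> processes = new ArrayList<>();
        Experiment(String experimentId) { this.experimentId = experimentId; }
    }

    public static void main(String[] args) {
        Experiment experiment = new Experiment("exp-001");
        // Each step of the workflow is a Process within the same experiment.
        experiment.processes.add(new Process("Gaussian"));
        experiment.processes.add(new Process("Lammps"));
    }
}
```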
>>>>> On May 21, 2018, at 1:25 AM, Yasas Gunarathne <yasasgunarat...@gmail.com> wrote:
>>>>>
>>>>> Hi Upeksha,
>>>>>
>>>>> Thank you for the information. I have identified the components that need to be included in the workflow execution framework. Please add anything that is missing.
>>>>>
>>>>> *1. Airavata Workflow Message Context*
>>>>>
>>>>> The Airavata Workflow Message Context is the common data structure that is passed through all Airavata workflow components. It includes the following:
>>>>>
>>>>> - *Airavata Workflow Messages* - These contain the actual data that needs to be transferred through the workflow. The content of the Airavata Workflow Messages can be modified at Airavata Workflow Components. A single Airavata Workflow Message Context can hold multiple Airavata Workflow Messages, stored as key-value pairs keyed by the ID of the component that last modified them. (This is required for the Airavata Flow Barrier.)
>>>>> - *Flow Monitoring Information* - Contains the current status and progress of the workflow.
>>>>> - *Parent Message Contexts* - Includes the preceding Airavata Workflow Message Contexts if the current message context was created in the middle of the workflow. For example, Airavata Flow Barriers and Airavata Flow Dividers create new message contexts by combining and copying messages respectively. In such cases the new message contexts list their parent message context(s) in this section.
>>>>> - *Child Message Contexts* - Includes the succeeding Airavata Workflow Message Contexts if other message contexts were created in the middle of the workflow from the current message context. For example, Airavata Flow Barriers and Airavata Flow Dividers create new message contexts by combining and copying messages respectively. In such cases the current message context lists its child message context(s) in this section.
>>>>> - *Next Airavata Workflow Component* - The component ID of the next Airavata Workflow Component.
>>>>>
>>>>> *2. Airavata Workflow Router*
>>>>>
>>>>> The Airavata Workflow Router is responsible for keeping track of Airavata Workflow Message Contexts and directing them to the specified Airavata Workflow Components.
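To make the proposed structure a bit more tangible, here is a rough sketch of such a message context as a plain Java class; all names are illustrative and none of them refer to existing Airavata classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the proposed Airavata Workflow Message Context.
public class WorkflowMessageContext {

    // Workflow messages keyed by the id of the component that last modified
    // them; the Flow Barrier relies on this keying to collect its inputs.
    private final Map<String, WorkflowMessage> messages = new HashMap<>();

    // Current status and progress of this branch of the workflow.
    private String flowStatus;

    // Contexts this one was combined from (Flow Barrier) or copied from (Flow Divider).
    private final List<WorkflowMessageContext> parentContexts = new ArrayList<>();

    // Contexts created later in the workflow from this one.
    private final List<WorkflowMessageContext> childContexts = new ArrayList<>();

    // Component id of the next Airavata Workflow Component to visit.
    private String nextComponentId;

    public void putMessage(String componentId, WorkflowMessage message) {
        messages.put(componentId, message);
    }

    public static class WorkflowMessage {
        // Links to input data, replica catalog entries of outputs, etc.
        private final Map<String, String> payload = new HashMap<>();
    }
}
```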
>>>>> *3. Airavata Workflow Components*
>>>>>
>>>>> *i. Airavata Workflow Operators*
>>>>>
>>>>> - *Airavata Flow Starter* - Responsible for starting a specific branch of the Airavata Workflow. A single Airavata Workflow can have multiple starting points. This component creates a new Airavata Workflow Message Context and registers it with the Airavata Workflow Router.
>>>>>   - Configurations
>>>>>     - Next Airavata Workflow Component
>>>>>     - Input Dataset File
>>>>> - *Airavata Flow Terminator* - Responsible for terminating a specific branch of the Airavata Workflow. A single workflow can have multiple terminating points.
>>>>>   - Configurations
>>>>>     - Output File Location
>>>>> - *Airavata Flow Barrier* - Works as a waiting component in the middle of a workflow. For example, if two experiments are running and the results of both are required to continue the workflow, the barrier waits for both experiments to complete before continuing. Within this component, multiple Airavata Workflow Messages should be packaged into a new Airavata Workflow Message Context.
>>>>>   - Configurations
>>>>>     - Components to wait on
>>>>>     - Next Airavata Workflow Component
>>>>> - *Airavata Flow Divider* - Opens up new branches of the workflow. It is responsible for sending copies of the Airavata Workflow Message to those branches separately.
>>>>>   - Configurations
>>>>>     - Next components to send copies to
>>>>> - *Airavata Condition Handler* - The path selection component of the workflow. It is responsible for checking the Airavata Workflow Message Context against conditions and directing it to the required path of the workflow.
>>>>>   - Configurations
>>>>>     - Possible next Airavata Workflow Components
>>>>>
>>>>> *ii. Airavata Experiments*
>>>>>
>>>>> These components are responsible for triggering the current task execution framework to perform specific experiments.
>>>>>
>>>>> *iii. Airavata Data Processors*
>>>>>
>>>>> These components are responsible for processing data in the middle of a workflow. Sometimes the output data of an experiment needs to be processed before being sent to other experiments as input.
>>>>>
>>>>> *iv. Airavata Loops*
>>>>>
>>>>> - *Airavata Foreach Loop* - This loop can be parallelized.
>>>>> - *Airavata Do While Loop* - This loop cannot be parallelized.
>>>>>
>>>>> As we have discussed, I am planning to implement this Airavata Workflow Execution Framework using Apache Helix. To get a clearer understanding of the project, it would help if you could provide some information about the experiment types (such as Echo and Gaussian) and the input and output data formats of these experiments.
>>>>>
>>>>> If we need to process data (see Airavata Data Processor) when connecting two experiments in the workflow, that should also be done on the supercomputers. I need to verify whether any implementation for data processing is currently available within Airavata.
>>>>>
>>>>> The following diagram shows an example workflow without loops. It would help if you could explain a bit more about the required types of loops within an Airavata workflow.
>>>>>
>>>>> <airavata-workflow-1.png>
>>>>>
>>>>> Regards
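If these operator and processor components are eventually mapped onto the Helix task framework as discussed, a Flow Barrier can, in principle, be expressed purely through job dependencies. The sketch below shows one hypothetical way to do that; the Helix task-framework API is assumed roughly as in 0.8.x, and the job names and the "RunProcessor" task command are placeholders, not existing Airavata task types.

```java
import java.util.Collections;

import org.apache.helix.task.JobConfig;
import org.apache.helix.task.TaskConfig;
import org.apache.helix.task.Workflow;

public class BarrierWorkflowSketch {

    // Helper to build a single-task job; "RunProcessor" is a hypothetical
    // task command that would be registered with the Helix task framework.
    private static JobConfig.Builder job(String taskId) {
        TaskConfig task = new TaskConfig.Builder()
                .setTaskId(taskId)
                .setCommand("RunProcessor")
                .build();
        return new JobConfig.Builder()
                .addTaskConfigs(Collections.singletonList(task));
    }

    public static Workflow build() {
        Workflow.Builder builder = new Workflow.Builder("airavata-workflow-with-barrier");

        builder.addJob("experiment-a", job("experiment-a-task"));
        builder.addJob("experiment-b", job("experiment-b-task"));
        builder.addJob("barrier", job("barrier-task"));
        builder.addJob("experiment-c", job("experiment-c-task"));

        // The barrier waits for both experiments; the next experiment waits
        // for the barrier, i.e. for both of its parents to complete.
        builder.addParentChildDependency("experiment-a", "barrier");
        builder.addParentChildDependency("experiment-b", "barrier");
        builder.addParentChildDependency("barrier", "experiment-c");

        return builder.build();
    }
}
```

Whether the barrier needs to be a job at all, or whether "experiment-c" could simply depend on both experiments directly, is a design choice; a dedicated barrier job gives a natural place to merge the two result sets into a single message context.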
>>>>> On Tue, May 1, 2018 at 9:06 PM, DImuthu Upeksha <dimuthu.upeks...@gmail.com> wrote:
>>>>>
>>>>>> Hi Yasas,
>>>>>>
>>>>>> This is really good. You have captured the problem correctly and provided a good visualization too. As we discussed in the GSoC student meeting, I was wondering whether we can compose these workflows as Helix Tasks as well (one task per Experiment). The only thing that worries me is how we can implement a barrier, as shown in your second workflow, using the current task framework. We might have to improve the task framework to support that.
>>>>>>
>>>>>> And we might need to think about constant / conditional loops and conditional (if-else) paths inside the workflows. Please update the diagram accordingly for future reference.
>>>>>>
>>>>>> You are on the right track. Keep it up.
>>>>>>
>>>>>> Thanks
>>>>>> Dimuthu
>>>>>>
>>>>>> On Sun, Apr 29, 2018 at 1:57 AM, Yasas Gunarathne <yasasgunarat...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> Thank you very much for the information. I did some research on the internals of the Orchestrator and the new Helix based (lower level) workflow execution over the past few weeks.
>>>>>>>
>>>>>>> Even though Helix supports chaining together any number of experiments (i.e. complete experiments including pre and post workflows), it is necessary to maintain a higher level workflow manager as a separate layer for the Orchestrator, and to submit experiments one after the other (if they cannot run in parallel) or in parallel (if their executions are independent), in order to preserve fault tolerance and enable flow handling of the higher level workflow.
>>>>>>>
>>>>>>> Therefore, the steps that the new layer of the Orchestrator is supposed to perform will be:
>>>>>>>
>>>>>>> 1. Parsing the provided high level workflow schema and arranging the list of experiments.
>>>>>>> 2. Sending experiments according to the provided order and saving their results in the storage resource.
>>>>>>> 3. If there are dependencies (the result of an experiment is required to generate the input for another experiment), managing them accordingly while providing support for modifying the results in between.
>>>>>>> 4. Providing flow handling methods (Start, Stop, Pause, Resume, Restart).
>>>>>>>
>>>>>>> I have attached a few simple top level workflow examples to support the explanation. Please provide your valuable suggestions.
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> On Mon, Mar 26, 2018 at 8:43 AM, Suresh Marru <sma...@apache.org> wrote:
>>>>>>>
>>>>>>>> Hi Yasas,
>>>>>>>>
>>>>>>>> Dimuthu already clarified, but let me add a few more points.
>>>>>>>>
>>>>>>>> That's a very good question, interpreter vs compiler (in the context of workflows). Yes, Airavata historically took the interpreter approach, where after the execution of each node in a workflow, the execution comes back to the enactment engine and re-inspects the state. This facilitated user interactivity during executions. The attached state transition diagram might illustrate it more.
>>>>>>>>
>>>>>>>> Back to the current scope: I think you got the overall goal correct and your approach is reasonable. There are some details missing, but that's expected. Just be aware that if your project is accepted, you will need to work with the Airavata community over the summer and refine the implementation details as you go. You are off to a good start.
>>>>>>>> Cheers,
>>>>>>>> Suresh
>>>>>>>> <workflow-states.png>
>>>>>>>>
>>>>>>>> On Mar 25, 2018, at 8:44 PM, DImuthu Upeksha <dimuthu.upeks...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi Yasas,
>>>>>>>>
>>>>>>>> I'm not an expert in XBaya design and use cases, but I think Suresh can shed some light on it. However, we no longer use XBaya for workflow interpretation, so don't confuse the workflows defined in XBaya with the description provided in the JIRA ticket. Let's try to make the concepts clear. We need two levels of workflows.
>>>>>>>>
>>>>>>>> 1. To run a single experiment of an Application. We call this a DAG, and a DAG is statically defined. It can have a set of environment setup tasks, data staging tasks, and a job submission task. For example, a DAG is created to run a Gaussian experiment on a compute host.
>>>>>>>> 2. To make a chain of Applications. This is what we call an actual Workflow. In a workflow you can have a Gaussian experiment running, followed by a Lammps experiment. So this is a dynamic workflow. Users can come up with different combinations of Applications as a workflow.
>>>>>>>>
>>>>>>>> However, your claim is true about pausing and restarting workflows. Whether it is a statically defined DAG or a dynamic workflow, we should be able to do those operations.
>>>>>>>>
>>>>>>>> I understand that some of the words and terminology in those resources are confusing and unclear, so please feel free to let us know if you need anything clarified.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Dimuthu
>>>>>>>>
>>>>>>>> On Sun, Mar 25, 2018 at 2:45 AM, Yasas Gunarathne <yasasgunarat...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> I have a few questions to clarify regarding user-defined workflow execution in Apache Airavata. Here I am talking about the high level workflows that are used to chain together multiple applications. This relates to the issue AIRAVATA-2717 [1].
>>>>>>>>>
>>>>>>>>> The documentation [2] says that the workflow interpreter that worked with XBaya provided an interpreted workflow execution framework, rather than a compiled workflow execution environment, which allowed the users to pause the execution of the workflow as necessary and update the DAG's execution states, or even the DAG itself, and resume execution.
>>>>>>>>>
>>>>>>>>> I want to know the actual requirement for having an interpreted workflow execution at this level. Is there any domain level advantage in allowing users to modify the order of the workflow at runtime?
>>>>>>>>>
>>>>>>>>> I think we can have pause, resume, restart, and stop commands available even in a compiled workflow execution environment, as long as we don't need to change the workflow.
>>>>>>>>> [1] https://issues.apache.org/jira/browse/AIRAVATA-2717
>>>>>>>>> [2] http://airavata.apache.org/architecture/workflow.html
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> --
>>>>>>>>> *Yasas Gunarathne*
>>>>>>>>> Undergraduate at Department of Computer Science and Engineering
>>>>>>>>> Faculty of Engineering, University of Moratuwa, Sri Lanka
>>>>>>>>> LinkedIn <https://www.linkedin.com/in/yasasgunarathne/> | GitHub <https://github.com/yasgun> | Mobile : +94 77 4893616

--
*Yasas Gunarathne*
Undergraduate at Department of Computer Science and Engineering
Faculty of Engineering, University of Moratuwa, Sri Lanka
LinkedIn <https://www.linkedin.com/in/yasasgunarathne/> | GitHub <https://github.com/yasgun> | Mobile : +94 77 4893616