On Sep 20, 2018, at 12:15 AM, Yasas Gunarathne wrote:

In the beginning, I tried to use the current ExperimentModel to implement 
workflows, since it has workflow-related characteristics, as you mentioned. 
It seems to have been designed with workflows as a primary focus at first, 
and it even includes ExperimentType.WORKFLOW. But apart from that and the 
database-level one-to-many relationship with processes, it provides no 
significant support for workflows.

I believe processes should be capable of executing independently at their level 
of abstraction. But in the current architecture, processes execute some 
experiment-related parts that go beyond their scope. For example, a process 
saves the experiment output along with the process output after it completes, 
which is not required for workflows. Here, submitting a message to indicate the 
process status should be enough.


I think Sudhakar addressed a lot of your questions, but here are some 
additional thoughts:

Processes just execute a set of tasks, which are specified by the Orchestrator. 
For workflows I would expect the Orchestrator to create a list of processes 
that each have a set of tasks that make sense for the running of the workflow.  
For example, regarding saving experiment output, the Orchestrator could either 
create a process to save the experiment output or have the terminal process in 
the workflow have a final task to save the experiment output.
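
To make the two options concrete, here is a minimal sketch in Python. It is 
not Airavata code: Process, plan_workflow and all the task names are 
hypothetical and only illustrate how an orchestrator might assemble the 
process list either way.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Process:
    # Hypothetical stand-in for a process: a name plus an ordered task list.
    name: str
    tasks: List[str] = field(default_factory=list)


def plan_workflow(app_names: List[str], separate_save_process: bool) -> List[Process]:
    """Build one process per application, then arrange for the experiment
    output to be saved in one of the two ways discussed above."""
    # Task names are made up for illustration only.
    processes = [
        Process(name=app, tasks=["stage_input", f"run_{app}", "stage_output"])
        for app in app_names
    ]
    if not processes:
        return processes
    if separate_save_process:
        # Option 1: a dedicated process whose only task saves the output.
        processes.append(Process(name="save_output", tasks=["save_experiment_output"]))
    else:
        # Option 2: the terminal process gets one extra final task.
        processes[-1].tasks.append("save_experiment_output")
    return processes
```

Either way the processes themselves stay generic: they run whatever tasks 
they were given, and the decision about experiment output stays with the 
orchestrator.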

If processes can execute independently, they don't need to keep experiment_id 
within themselves in the table. Isn't it the responsibility of the outer 
layer, whatever it is (Experiment/Workflow), to keep this mapping? WDYT? :)

Possibly. I wonder how this relates to the recent data parsing efforts. It 
does make sense that we might want processes to execute independently because 
we do have the use case of running task DAGs separate from any experiment-like 
context.

As you have mentioned, we can keep an additional Experiment within the Workflow 
Application to keep the current Process execution unchanged. (Here the 
experiment is still executing a single application.) Is that what you meant?


Not quite. I was suggesting that the Experiment is the workflow instance, 
having a list of processes where each process executes an application 
(corresponding roughly to nodes in the workflow DAG).
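
Roughly, and only as a sketch under my own assumptions (these classes and 
field names are hypothetical, not the existing Airavata models), that shape 
could look like this in Python. The experiment owns the process list and the 
dependencies between processes, while a process carries no experiment_id of 
its own:

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Dict, Iterable, List, Set


@dataclass
class Process:
    # Hypothetical model: a process only knows which application it runs.
    process_id: str
    application: str


@dataclass
class Experiment:
    # Hypothetical model: the experiment is the workflow instance and keeps
    # the mapping to its processes, plus the DAG edges between them.
    experiment_id: str
    processes: Dict[str, Process] = field(default_factory=dict)
    depends_on: Dict[str, Set[str]] = field(default_factory=dict)

    def add_process(self, process: Process, after: Iterable[str] = ()) -> None:
        """Register a process and the ids of the processes it must wait for."""
        self.processes[process.process_id] = process
        self.depends_on[process.process_id] = set(after)

    def execution_order(self) -> List[str]:
        """Return process ids in an order that respects the dependencies."""
        return list(TopologicalSorter(self.depends_on).static_order())
```

For a simple chain, adding p1, then p2 after p1, then p3 after p2 yields the 
order p1, p2, p3. A process created outside any experiment would look 
exactly the same, which fits the use case of running task DAGs without an 
experiment-like context.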

Thanks,

Marcus
