Thanks for doing that, Marlon. I have a couple of questions; sorry for my naivete.
Is the term CPI specific to Airavata? I have not heard it before.

When you state: "Note jobs can complete but tasks can fail." do you mean: "Note that although jobs within a given task can complete, the task they are contained in may fail."?

Question: Is it really possible for a user-initiated activity to be persistent? Maybe I don't understand the use case or language. I wonder how scalable it can be for user-initiated activity to persist.

Question: If jobs can be persistent, tasks and workflows may also be persistent, right? This also seems potentially like an issue, if my understanding of persistence is correct.

For the Orchestrator: If it knows the status of XSEDE and other resources, do we know how it gets that information? Is there a specific way it plugs in to other remote resources that ensures that info is provided? (In other words, there are many kinds of resources, and perhaps many ways of broadcasting their condition; or maybe it is just online/offline?)

Also, if the Orchestrator knows the status of the remote resources, can it pass that information forward to the Gateway front end, so it can be printed in the user interface somewhere?

From my perspective, it is way cooler if the user knows before submitting that there will be a delay, or re-routing of their job.

Mark

-----Original Message-----
From: Marlon Pierce [mailto:[email protected]]
Sent: Wednesday, May 14, 2014 8:02 AM
To: [email protected]; [email protected]
Subject: Orchestrator description draft

Dear all--

I've written up a draft description of the Orchestrator [1] and welcome comments and critiques. As with the GFAC description, this is not necessarily based on the current implementation. The purpose is to create an implementation-independent description of the Orchestrator for future reference.

Some outcomes from this exercise:

* The interactions of the Workflow Interpreter, Orchestrator, and API server need to be thought out. Don't take my suggestions here too seriously.
* The scheduler component of the Orchestrator needs more thought, especially if there are multiple Orchestrators running (for load balancing): we don't want to run into "thread" issues if multiple schedulers are trying to work with the registry.
* Our current concept for extending the Orchestrator is to extend the CPI. You would do this to implement, for example, more sophisticated scheduling. But we could take a GFAC approach of having a core and developer-provided plugins (for scheduling, quality of service, etc).

Marlon

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=40511565
