The current GFAC providers all execute tasks in "blocking" mode: the provider stays active until the job terminates. This introduces some tradeoffs. On the one hand, determining the job state is very provider-specific, so doing it all in the provider keeps things relatively simple to implement.
On the other hand, blocking execution complicates Airavata's state, which makes fault recovery and "elastic" scenarios harder: we may need to restart failed servers, pass work from one running instance to another, and so forth. Making the provider stateless and moving monitoring elsewhere would take some thoughtful design (I don't have an idea of the scope), so even if we all agreed it is a good idea, we would have to overcome the energy barrier of a current system that is good enough for what we need to do.

What are your thoughts? We had a related discussion about this for a specific use case back in July [1].

Marlon

[1] http://mail-archives.apache.org/mod_mbox/airavata-dev/201307.mbox/%[email protected]%3E
