While I won't claim to have done a thorough examination of the proposal to use the IBM tool for developing workflows, I am concerned about several items relating to configuration management. Several articles in the new Comm. ACM (CACM, Vol. 57, No. 9, 2014) bear on security and configuration management. I'd highly recommend getting a copy and taking a look at the articles in the middle of the issue.
1. Kern, C., 2014: Securing the Tangled Web, CACM, 57(9), 38-47 presents a view of security issues due to script injection vulnerabilities that makes JSON and other technologies that use JavaScript less secure than one would like. Kern is an information security engineer for Google. He discusses not only the nature of the XSS vulnerabilities, but also work Google has undertaken to reduce their risk. These include building in special-character handling, designing and testing automated templates for interface designers to use, and project management enforcement of strict disciplines that forbid use of vulnerable software. (A minimal sketch of the escaping idea appears after these comments.) Unfortunately, the cures add to the learning curve for using these tools - and increase the maintenance cost of software because they need to be applied "forever".

2. Workflows (or, in normal project management nomenclature, Work Breakdown Structures) are graphs whose complexity increases markedly as more activities and objects get included. If one is aiming for high integrity or fully replicable and transparent software systems, one must maintain the ability to retain configuration. The old NCAR FORTRAN manuals (ca. 1980) had a cover that embedded the notion "It ran yesterday. It's been running for years. I only changed one card." This means that software that is updated (by revisions due to concerns over security or to make other improvements) could require verification that the updates haven't changed numerical values; a sketch of such a regression check also appears after these comments. Based on my personal experience with Ubuntu Linux (or Windows - whatever), updates occur on at least a weekly basis, with the organizations responsible for their software deciding when to send out updates. This rate of update makes the Web a pretty volatile environment. In most organizations that have system administrators, they bear the burden this turmoil creates. End users may not realize the impact, but it costs time and attention to avoid being overwhelmed.

3. In many of the software packages we use, the organization providing the software manages package updates with a centralized package manager. In Linux, Debian (and the derivative Ubuntu family of software) uses one centralized manager to produce the .deb packages that contain appropriate provenance metadata for maintaining configuration. Red Hat and SuSE Linux use the alternative RPM package format, with its own metadata. These package managers do not operate in the same way. For example, if you want to ingest RPM packages into Ubuntu, you have to install a package called alien and use that to convert the RPM to .deb format (the third sketch after these comments shows what that looks like). The same pleasantries affect Java, databases, and Web standards. Because some of these organizations are real commercial enterprises making their money from customers outside of the federal contracting venue, it seems unrealistic to expect that funding agencies will develop one common standard for configuration management. While funding agencies might think a single standard for configuration would solve their problems, that would require an unprecedented degree of cooperation between agencies, data producers, and data users. The time scale for reaching agreements on this kind of "social engineering" is almost certainly at least a decade, during which the technological basis in hardware and software will have evolved out from under the agencies.

I suspect that the security issues relating to JSON and such are the immediate concern. On a slightly longer time frame, it's important to remember that the complexity of workflow scaling makes a single tool unlikely.
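To make the first point a bit more concrete, here is a minimal Python sketch of the escaping idea behind Kern's recommendations. The function names and the toy markup are my own illustration, not Google's actual libraries; the point is simply that untrusted text gets encoded before it reaches an HTML or script context.

    # Minimal illustration of context-aware output escaping for untrusted input.
    # Only a sketch of the idea in Kern's article; names are hypothetical.
    import html
    import json

    def render_comment(comment_text: str) -> str:
        # Escaping <, >, &, and quotes keeps attacker-supplied markup inert
        # when it is interpolated into an HTML context.
        return "<p>{}</p>".format(html.escape(comment_text))

    def render_js_config(settings: dict) -> str:
        # json.dumps produces a structured literal for a <script> context;
        # escaping "</" keeps a string value from closing the element early.
        payload = json.dumps(settings).replace("</", "<\\/")
        return "<script>var config = {};</script>".format(payload)

    if __name__ == "__main__":
        hostile = '<script>alert("pwned")</script>'
        print(render_comment(hostile))             # markup is neutralized
        print(render_js_config({"msg": hostile}))  # data stays data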
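For the verification concern in the second point, here is a sketch of what "check that the update didn't change the numbers" can look like in practice. The file names and the tolerance are assumptions for illustration; a real project would wire something like this into its own test harness and archive the reference outputs under configuration control.

    # Sketch of a numerical regression check: compare a freshly produced output
    # against a frozen reference from before the update. File names and the
    # tolerance are illustrative assumptions, not part of any particular system.
    import numpy as np

    def outputs_match(reference_file: str, candidate_file: str,
                      rtol: float = 1e-12) -> bool:
        reference = np.loadtxt(reference_file)
        candidate = np.loadtxt(candidate_file)
        if reference.shape != candidate.shape:
            return False
        return np.allclose(reference, candidate, rtol=rtol)

    if __name__ == "__main__":
        # "reference_run.txt" was archived before the update; "current_run.txt"
        # comes from the updated tool chain.
        ok = outputs_match("reference_run.txt", "current_run.txt")
        print("results unchanged" if ok else "results DIFFER - investigate before release")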
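And for the third point, the RPM-to-.deb conversion step with alien, wrapped in Python only to keep these sketches in one language; the package file name is hypothetical.

    # Sketch of converting an RPM package to .deb with the "alien" utility
    # (installed via apt-get install alien). The package name is a placeholder.
    import subprocess

    def rpm_to_deb(rpm_path: str) -> None:
        # "alien --to-deb" rewrites the package and its metadata into Debian form;
        # running as root (or under fakeroot) keeps file ownership sensible.
        subprocess.run(["sudo", "alien", "--to-deb", rpm_path], check=True)

    if __name__ == "__main__":
        rpm_to_deb("example-package-1.0-1.x86_64.rpm")  # hypothetical file name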
A solution for data production with short chains of relatively isolated objects (a single investigator conducting a few investigations per year) is vastly different from production flows such as weather forecasting or some kinds of climate data production (large teams of software developers and scientists - hundreds of people - running thousands of jobs per day). Configuration management for the latter kind of project requires building group cultures that recognize the importance of managing the configuration - and that does take up a lot of time, even for the scientists involved.

I won't say I'm sorry for the length of these comments. Some issues can't be reduced to sound bites or bullets; the chain of reasoning is simply longer.

Bruce B.