Hi,

Thanks for posting the T2 REST outline. I have some thoughts which I hope are useful. Sorry to make this so long, but I'm beginning to despair about workflow tooling for REST consumers. I am working towards using Taverna but am not yet experienced with it. With that in mind, I hope you find the suggestions useful and the argument compelling.

Overall I can't see much in the current proposal that is worth following; sorry to be so brutal. Much more could be gained by giving T2 the ability to run scripting languages such as Ruby, which has several well-developed and battle-tested frameworks, each aimed at making it easy to consume HTTP (and other protocol) services, RESTful or not.

One inescapable issue is what is meant by a RESTful service. Opinions range from "anything over HTTP (and only HTTP)" to "only what Roy Fielding has described" (inventor's prerogative).

Reader's Digest version of what follows:

- Offering a Ruby (my preference), Python, etc. 'service' will get you 90% of the way towards accommodating additional RESTful (and non-RESTful) services. Is this currently possible, or on a roadmap?
- OpenID authentication and OAuth authorization are important [3].
- Generic input and output ports based on script variables, for passing data to and from any Ruby, Python, etc. script using 'convention over configuration'.
- A special case is providing URI input and output ports. That is, protocol-agnostic, URI-specific input and output ports, e.g. populate an input port's URI template with data from other input ports (iterating over arrays, etc.).

On Fri, Apr 9, 2010 at 1:59 AM, Alan Williams <[email protected]> wrote:
> Hope you find this useful,
>
> Alan
>
> Hi!
>
> My colleague Stuart Owen has been specifying how we can build a
> lightweight support for REST services in Taverna [1] .

Lightweight is good. I do think fewer HTTP specifics in T2 would facilitate the development of more RESTful services. Having said that, I didn't find the specification in [4] sensible, nor the example at [5] compelling. Both do reflect a view I held until I wrestled with [1]. I think Fielding addresses the issues.

> We've scheduled
> the first iteration of implementing this to be made available for
> Taverna 2.2 this spring.
>
>
> As some of you know, you can access most REST services already using
> the 'Get page from URL' local worker, but this would mean half your
> workflow is spent constructing an URL.

Why would anyone do that? Could they not just write some Beanshell script (JavaScript), using ExtJS? Or an R-project script using RCurl or httpRequest? And if they are code-averse, why would they be interested in a GET service? Just document that if you put this URI in here, the output ports named x, y, z will contain the data 1, 2, 3. They don't need to know that in v1 it came from a single GET, in v1.3 from three GETs, and in v2 from one HTTP, four FTP and one SMTP request!

> Also this worker can only use
> the HTTP GET method, and with no way to specify other parameters such
> as the 'Accepts' header.

I _really_ think there is much to be gained by abstracting these sorts of protocol-specific concerns away from the RESTful service component. There are many well-designed and battle-tested frameworks available for accessing RESTful services. These frameworks are in languages like Ruby, Python, Perl, etc. T2 would benefit by throwing open some doors to those languages in the same way it opens the door to JavaScript and R. IMO it is much more valuable to provide some way of leveraging all that prior art/experience.

This is not to say there is nothing REST-specific to do. For example: user-friendly (GUI) provision of OpenID/OAuth support, 'automagically' filling in URI template input ports from other input port data, and robust and efficient mapping between input/output ports and script variables.

>
> He has worked on this proposal together with several local users, and
> we had a meeting where we discussed this approach. There are of course
> plenty of more possibilities, but the goal of this planned work is to
> build something that should make it just a little bit easier to do
> REST services in Taverna.

I do like the URI parameter-inputs/port-name mapping idea; it seems natural, and even appears to satisfy a Fielding rule! ;) In case it is implied, I would not force input ports to match URI template parameters at the outset. A key to RESTful design is to decouple the client and the server. Making the input-port data available for the life of a REST client script gives the provider flexibility to consume that data at different points in time, and so to change, at a later date, the order in which data is 'discovered' via some hypertext-driven pathway.

I think the four protocol-specific services (or even N services) are a bad idea. The user should not have to know whether the URI they provide is first consumed with a POST or a GET, and the script provider should be free to change this by providing an updated client script.

I'd really prefer an approach where T2 provides ways for Ruby, Python, etc. script authors to run their client code, and those authors are free to use/abuse the 'RESTful' definition and to decide how their client code interacts with their service. I think having T2 worry about headers, response codes, etc. constrains service providers (e.g. to HTTP). It seems to me the issue is not whether T2 is happy with the data or protocol, but whether the script processing the response knows what to do with it (given its knowledge of media types, etc.; see Fielding's post [1]).

>
> Part of the challenge is that most REST services are not formally
> described,

Roy might disagree [1] - any REST service is formally described. Or at least Roy Fielding seems to think there is a formal description of any and all RESTful services - if you don't match the definition/rules, he asks us to call it something else (WADL?). I think you meant to say that many services claim to be RESTful when they are not. Consequently, there is a plethora of 'interpretations' claiming to be RESTful.
In my experience the problem is services describing themselves as RESTful when the inventor indicates they clearly are not. This situation is unavoidable while REST knowledge and experience spread, and are vigorously debated, among developers. My point is this: T2's 'challenge' is to accommodate services that claim to be RESTful (but are not) without shutting out services that are or, at least, are striving to be RESTful. Frankly, I really think T2 would benefit by sidestepping this whole issue and providing scripting language components that are convenient and generic - let the different service providers defend their choices and their work.

> even though there are several proposed standards for doing
> so (WADL, WSDL2). This solution will therefore require the user to
> specify URL pattern for the service themselves. However, we're also
> allowing for future linking with BioCatalogue, which can have richer
> descriptions of REST services.
>
> (See for example http://sandbox.biocatalogue.org/rest_methods/41 of
> such descriptions which we in the future could show from the
> 'Available services' panel, and show help about using the BioCatalogue
> plugin. )

Okay, this is a case in point. Ignore Fielding's definition/rules for the moment. It would be useful if T2 allowed for a service that took a different approach. It may or may not be a REST service, may or may not use HTTP, and so could be very different from this service.

>
> One of the other challenges with REST support is how to deal with the
> output data, for instance using XPath for XML output, or a similar
> JSONPath for JSON output.

Again, I'd avoid even touching prescriptions on this topic. Can't the REST service script provider be left to consume this data themselves, per the Beanshell and RShell service conventions for port/variable name mapping? Of course the REST service script provider is free to document that an output port named 'chew_this' will contain an XML document from hell.

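To make that concrete, here is a rough sketch of a provider script digesting its own response and handing back only the names it has documented. Everything in it is an assumption for illustration: the JSON body, the field names, and the plain hash standing in for T2 output ports. None of it is an existing Taverna API.

```ruby
require 'json'

# Stand-in for a response body the provider's client code has already fetched.
# In a real script this would come from the provider's own library.
RAW_BODY = '{"gene": {"id": "gene-42", "length": 1200, "aliases": ["g42", "gx"]}}'

# Consume the provider's format inside the script and return only the values
# the provider has documented, keyed by (hypothetical) output port name.
def digest(body)
  gene = JSON.parse(body).fetch('gene')
  {
    'gene_id'      => gene.fetch('id'),
    'gene_length'  => gene.fetch('length'),
    'gene_aliases' => gene.fetch('aliases', []),
    'chew_this'    => body # the undigested document, for anyone who wants it
  }
end

outputs = digest(RAW_BODY)
outputs.each { |port, value| puts "#{port}: #{value.inspect}" }
```

T2 never needs to know whether the body was JSON or XML; a later version of the script could switch formats and keep the same port names.
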
>
>
> Feel free to have a look at Stuart's specification [1] and provide us
> with any feedback. We're also interested in example services that you
> envision using from Taverna.
>
>
> [1] http://www.mygrid.org.uk/dev/wiki/display/developer/T2+REST+Support
>

Currently I'm trying to move my understanding closer to Fielding's definitions/descriptions - he is bright, and has thought about this subject. So far, when I've understood the practical effects of what he's described, there have been substantial benefits - YMMV. A definition which has influenced my thinking is [1]. It is succinct enough for me to periodically evaluate what I'm currently doing or thinking against it.

It took me a long time to wrap my mind around the _practical_ implications behind Roy's description/rules. I'm certainly not a guru, so bear in mind that my understanding is evolving. Anyway, as I have tried to force my ideas to conform to what Roy describes, a direct (beneficial) consequence has been to eliminate prior design decisions that I now realize were poor, or that forced me into situations where there was no good solution open to me. When that improvement process stops I'll probably stop moving my implementation/architecture towards his definitions.

It would be exciting to have Taverna's REST service allow the full range of what people think RESTful behavior/architecture is; at the least this would allow legacy service ideas (such as mine) to evolve to be RESTful. In fact I think it'd be good to explicitly decline to offer a Taverna view of how a RESTful service behaves - or what is RESTful - beyond some minimal things, like starting with a URI (i.e. HTTP optional) and settling on a defined template format; see [2].

I think great flexibility (hence functionality) could be achieved with a few conveniences provided by a Taverna RESTful service GUI component:

A) OpenID authentication and OAuth authorization, two- and three-legged [3].

B) A _single_ URI entry point: The URI could be a parameterized template (for example, see [2]), where the value from any input port whose name matches a template field is substituted into the template. Or the URI could be complete. It would be useful to have a couple of behaviors for populating a template (synchronized, left-to-right, right-to-left) by iterating over array data in other input ports.

C) Delegate loading/running the RESTful client's script code (Jython/JRuby, etc.) to a minimal script. Ideally the full power of these languages would be exposed, allowing all their pre-existing and emerging frameworks to be used. T2 could easily say:

"The T2 convention is to make the data stored in script variables available to output ports whose port name matches the variable name. Scripts supplied by RESTful API service providers are responsible for consuming their responses and making them available via variable/port names they document for end users."

I think this is how the Beanshell and Rshell services work? To avoid difficulties tracking variables when their count explodes, there could be a prefix (t2_out_) naming convention, e.g. (Ruby) @t2_out_yellow.

This way it is left to the RESTful script author to consume their output formats; they eat their own dog food rather than forcing everyone else to consume JSON, XML, etc. This would of course still allow a RESTful provider to dump a whole HTTP response, or XML document, in an output port and have another T2 component consume it... horses for courses.

This may be going off topic... Providing a T2 RESTful service component that fits this (RESTful) service description would be much easier, and 90% complete, if Taverna were just able to run Ruby, Python, etc. scripts. In the tradition of "write the code you wish you had".

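Before the full script, a smaller sketch of the two conventions in B and C above. It assumes `{name}`-style template fields, shows only the "synchronized" iteration behavior, and uses a plain hash for input ports; the method names `expand_template` and `collect_outputs` are hypothetical and not part of T2.

```ruby
require 'erb'

# Expand a URI template from input-port data (point B).
# Array-valued ports are iterated in step ("synchronized"); scalar ports are
# reused for every URI. Fields with no matching port are left untouched so the
# client script can fill them in later.
def expand_template(template, ports)
  fields  = template.scan(/\{(\w+)\}/).flatten.uniq
  lengths = fields.map { |f| ports[f] }.grep(Array).map(&:length).uniq
  raise ArgumentError, 'array ports differ in length' if lengths.size > 1

  (0...(lengths.first || 1)).map do |i|
    template.gsub(/\{(\w+)\}/) do |field|
      name = field[1..-2] # strip the surrounding braces
      next field unless ports.key?(name)

      value = ports[name]
      value = value[i] if value.is_a?(Array)
      ERB::Util.url_encode(value.to_s)
    end
  end
end

# Collect @t2_out_<port> instance variables into a port-name => value hash (point C).
def collect_outputs(script)
  script.instance_variables.grep(/\A@t2_out_/).map do |ivar|
    [ivar.to_s.sub('@t2_out_', ''), script.instance_variable_get(ivar)]
  end.to_h
end

ports = { 'organism' => 'human', 'gene' => %w[g1 g2] }
puts expand_template('http://example.org/{organism}/genes/{gene}?page={page}', ports)

script = Object.new
script.instance_variable_set(:@t2_out_yellow, 'banana')
script.instance_variable_set(:@scratch, 'not exported')
p collect_outputs(script)
```

Leaving unmatched fields such as `{page}` in place is deliberate: it keeps input ports from being forced to match the template at the outset.
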
I have in mind the following (Ruby) script:

#-------- start Ruby script -----------
require 'my-restful-lib'
require 'my-restful-lib/t2'

puts "This is what T2 passed in via the env: #{ENV['T2_URI_INPUT']}"
puts "This is my oauth key from the T2 GUI component: #{ENV['T2_OAUTH_KEY']}"

# Do something with ENV['T2_INPUT_PORT_NAMES']

resp = MyRestfulLib::T2.grab ENV['T2_URI_INPUT'], :oauth_key => ENV['T2_OAUTH_KEY']

# Do something with resp

# Export value to T2 RESTful service output port names....
# How - using system ENV, other?
# '&&' is the logical and; a single '&' would bind tighter than '==' and '<'.
# to_i tolerates a header value that arrives as a string.
if resp.code == 200 && resp.header[:content_length].to_i < 20
  @t2_interesting_data = resp.body.wow
else
  @t2_interesting_data = 'boring'
end

puts "#{@t2_interesting_data}"
#--------- end Ruby script ------------

Of course I have just used 'puts' and ENV. It is not clear to me what the best way is to pass data between T2 and scripting frameworks such as Ruby/Python (JRuby/Jython)...

The advantages of this approach are (per Fielding's rules):

- The T2 RESTful service (T2 service) does not force providers to adopt any single communication protocol (the result formatting might resemble HTTP content, but the API client code could have obtained that data using FTP, etc.)
- T2 service providers have a free hand with their media types
- T2 service providers have a free hand with URI construction within their scripts
- T2 service providers have a free hand following hypertext within their scripts (HATEOAS)
- The T2 service does not require definition of fixed resource names or hierarchies
- The T2 service does not require that any “typed” resources be significant to the client's RESTful API library
- The T2 service can initialize any RESTful API with no prior knowledge beyond the initial URI (i.e. populated template)

The main work/issues might be:

- Implementing JRuby, Jython, etc.
  script support (bringing additional benefits beyond REST services)
- Settling on conventions for passing data between these scripts and the T2 RESTful service ports
- OAuth

Two significant advantages of this proposal are:

I) Apart from populating an input URI with values from matching input port names, this 'service' would really just be a Ruby/Python (JRuby/Jython) 'service', so there would likely be much greater bang in addition to having access to existing RESTful frameworks, e.g. Ruby's EventMachine or Python's Twisted.

II) The details of any RESTful API's contents and behavior are abstracted away from T2, and those headaches are left where they belong - with the RESTful API service providers.

I deliberately have not emphasized WADL, etc. More than a year ago I tried to implement a simple RESTful process using WADL - it didn't work. At the time I wasn't as familiar with Roy's 'rules'; maybe WADL is now REST-capable? I do know that there are some who swear _by_ WADL and others who just swear _at_ WADL :) In what I have suggested, those issues are left to the writer of the RESTful library (MyRestfulLib::T2 in the example above).

Mischief follows: in fact, in the above proposal it is _possible_ that no data is ever obtained across the network, nor HTTP ever used, and the 'response' is just manufactured in memory ;)

Thanks for the opportunity to provide feedback, and I appreciate any comments.

Regards
Mark

[1] http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven
[2] http://tools.ietf.org/html/draft-zyp-json-schema-02#section-6.1.1
[3] http://oauth.net
[4] http://www.mygrid.org.uk/dev/wiki/display/developer/T2+REST+Support
[5] http://sandbox.biocatalogue.org/rest_methods/41

> --
> Stian Soiland-Reyes, myGrid team
> School of Computer Science
> The University of Manchester
_______________________________________________
taverna-hackers mailing list
[email protected]
Web site: http://www.taverna.org.uk
Mailing lists: http://www.taverna.org.uk/about/contact-us/
Developers Guide: http://www.taverna.org.uk/developers/
