Hi Damian et al,

Two brief remarks:

1. The problem is not just that the names of things change (which in
principle could be dealt with automatically). The content may change too
(e.g. as new data comes in or the science advances), and any workflow
that makes assumptions about the content of a database, rather than just
the datatypes and names, can be broken by this. Hence versioning of a
webservice is IMHO the only sane approach: application developers can
develop their workflow against a particular version of a webservice, and
it will always run. When _they_ update, they know that things might
change and that they should do _their_ semantic checks.
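
To make the distinction in point 1 concrete, here is a minimal Python
sketch. Everything in it is made up for illustration (the RELEASES table,
the gene record, the function names); it is not the API of any real
webservice. The two releases have identical names and datatypes, and only
the content differs, which is exactly the case a rename mapping cannot
catch.

```python
# Hypothetical in-memory stand-in for a versioned webservice.
# Each release is frozen, so a query against "v1" always returns the same data.
RELEASES = {
    "v1": {"gene_a": {"chromosome": "1", "biotype": "protein_coding"}},
    "v2": {"gene_a": {"chromosome": "1", "biotype": "pseudogene"}},
}

# The workflow author develops against one release and pins it.
PINNED_VERSION = "v1"


def fetch(gene_id, version=PINNED_VERSION):
    """Look up a record in one specific, immutable release."""
    return RELEASES[version][gene_id]


def content_assumptions_hold(version):
    """Semantic check the workflow author runs before moving to a new release.

    The assumption encoded here (gene_a is protein coding) is an example of
    something the workflow relies on beyond names and datatypes.
    """
    record = RELEASES[version].get("gene_a")
    return record is not None and record["biotype"] == "protein_coding"
```

A workflow pinned to "v1" keeps working indefinitely; calling
content_assumptions_hold("v2") before switching is the kind of check the
developer would run at update time.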

2. The Bioconductor project has already come up with a solution to this
(predating, and as an alternative to, biomaRt): metadata packages. The
approach has its warts, but it _does_ provide a stable source of genomic
data against which workflows can be programmed without breaking behind
the user's back. The packages are versioned, self-documenting lumps of
metadata that can be downloaded, installed in R, and used as long as
needed. They are updated with every Bioconductor release; users can (but
don't have to) update, at which point they also need to check whether
their scripts that make assumptions about content still work.

http://www.bioconductor.org/packages/release/data/annotation/

 Best wishes
 Wolfgang



>>> What is the usual approach taken in webservices to keep track of name 
>>> changes in external software? Presumably just hoping people don't change 
>>> them is not the approach being taken as that is doomed to failure. Is 
>>> there some sort of format to represent the old->new name mappings that 
>>> people typically use?
>> There isn't a way of coping with it, if the names / schema / signature 
>> of methods etc change you're screwed as a client. There's no reason I 
>> can see for the internal names to change anyway, the user visible names 
>> haven't changed so in this case it seems entirely unwarranted, who does 
>> it help?
> 
> it was to help people using the API directly in scripts or through their 
> own Query XML rather than a nice GUI like Taverna. These all use the 
> internalNames of attributes and filters so it helps if they match the 
> displayNames (with underscores instead of spaces etc)
> 
>> I know it's not a property of biomart as such, it's a property of the 
>> way biomart is used but it's still an important point - we now have 
>> clients which can store queries, those stored queries should be made as 
>> stable as possible. There is nothing we can do about this, if we have a 
>> query stored and we try to apply it to the data set after the metadata 
>> has changed the query will be broken, end of story.
>>
> 
> hmmm - seems like there should be a way of webservices deployers giving 
> mappings between versions to support this type of thing. I can't really 
> see how any sort of storage of queries is going to work, especially in the 
> biological community :) We find it hard to maintain stability just within 
> biomart; if you have a pipeline involving several external resources, I would 
> have thought things get unmanageable
> 
> cheers
> Damian
> 
> 
>> Cheers,
>>
>> Tom
>>
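
For the old->new name mappings asked about in the quoted thread, one
possible shape is a per-release rename table that a client applies to its
stored queries. The sketch below is only an assumption about how such a
table might look; the RENAMES structure, the attribute names and
migrate_query are hypothetical, and nothing like this is claimed to exist
in biomart.

```python
# Hypothetical rename table published by a service deployer:
# maps (from_version, to_version) to {old internal name: new internal name}.
RENAMES = {
    ("v1", "v2"): {"gene_sym": "gene_symbol", "chr": "chromosome_name"},
}


def migrate_query(attributes, from_version, to_version):
    """Rewrite a stored query's attribute names for a newer release.

    Names without an entry in the rename table are passed through unchanged.
    """
    mapping = RENAMES[(from_version, to_version)]
    return [mapping.get(name, name) for name in attributes]
```

This only covers renames. A stored query migrated this way can still give
different answers if the content behind the names has changed.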


-- 
------------------------------------------------------------------
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber
