On Mon, 24 Jul 2006, Wolfgang Huber wrote:

> Hi Damian et al,
> 
> two brief remarks:
> 
> 1. The problem is not just changing the names of things (which in
> principle could be dealt with automatically). But also the content may
> change (e.g. as new data comes in or the science advances), and any
> workflow that makes any sort of assumptions on the content of a database
> rather than just the datatypes and names can be broken by this. Hence
> versioning of a webservice is IMHO the only sane approach - application
> developers can develop their workflow against a particular version of a
> webservice, and it will always run. When _they_ update, they know that
> things might change and that they should do _their_ semantic checks.

Versioning does seem to be the easiest and best way to go. Ensembl, and 
hence the marts built from it, are already versioned, so any names in 
datasets from, say, the last release (39) should be guaranteed not to change. 
If you try to run the same query against release 38 then I guess it will 
hopefully work, but that is not guaranteed. We just need to flesh this 
out in our services interface - the logistics are already there.
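
To make the idea concrete, here is a rough sketch of a stored query that
records the release it was developed against. Everything in it
(StoredQuery, release_status, the dataset and attribute names) is made up
for illustration and is not part of any existing biomart interface:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredQuery:
    """A saved query plus the release it was written against (hypothetical layout)."""
    dataset: str
    release: int
    attributes: tuple


def release_status(query: StoredQuery, server_release: int) -> str:
    """Classify a run of `query` against a server at `server_release`."""
    if server_release == query.release:
        # Same release: names in the dataset should not have changed.
        return "guaranteed"
    # Any other release: the query may still work, but the caller must re-check.
    return "unchecked"


# Placeholder dataset and attribute names, not real mart names.
q = StoredQuery(dataset="example_dataset", release=39,
                attributes=("example_attribute",))
print(release_status(q, 39))
print(release_status(q, 38))
```

A client could use the "unchecked" result as the point at which to rerun
its own semantic checks before trusting the output.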

best wishes
Damian


> 
> 2. The Bioconductor project has already come up with a solution to this
> (before, and alternative to biomaRt): it also has its warts, but it
> _does_ provide a stable source of genomic data against which workflows
> can be programmed and don't break underhand: Metadata packages. They are
> versioned, self-documenting lumps of metadata that can be downloaded,
> installed in R, and used as long as needed. They are updated with every
> Bioconductor release, users can (but don't have to) update, at which
> point they also need to check whether their scripts that make
> assumptions on content still work.
> 
> http://www.bioconductor.org/packages/release/data/annotation/
> 
>  Best wishes
>  Wolfgang
> 
> 
> 
> >>> What is the usual approach taken in webservices to keep track of name 
> >>> changes in external software? Presumably just hoping people don't change 
> >>> them is not the approach being taken as that is doomed to failure. Is 
> >>> there some sort of format to represent the old->new name mappings that 
> >>> people typically use?
> >> There isn't a way of coping with it, if the names / schema / signature 
> >> of methods etc change you're screwed as a client. There's no reason I 
> >> can see for the internal names to change anyway, the user visible names 
> >> haven't changed so in this case it seems entirely unwarranted, who does 
> >> it help?
> > 
> > it was to help people using the API directly in scripts or through their 
> > own Query XML rather than a nice GUI like Taverna. These all use the 
> > internalNames of attributes and filters so it helps if they match the 
> > displayNames (with underscores instead of spaces etc)
> > 
> >> I know it's not a property of biomart as such, it's a property of the 
> >> way biomart is used but it's still an important point - we now have 
> >> clients which can store queries, those stored queries should be made as 
> >> stable as possible. There is nothing we can do about this, if we have a 
> >> query stored and we try to apply it to the data set after the metadata 
> >> has changed the query will be broken, end of story.
> >>
> > 
> > hmmm - seems like there should be a way of webservices deployers giving 
> > mappings between versions to support this type of thing. I can't really 
> > see how any sort of storage of queries is going to work, especially in the 
> > biological community :) We find it hard to maintain stability just within 
> > biomart; if you have a pipeline involving several external resources I would 
> > have thought things get unmanageable
> > 
> > cheers
> > Damian
> > 
> > 
> >> Cheers,
> >>
> >> Tom
> >>
> 
> 
> -- 
> ------------------------------------------------------------------
> Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber
> 
> 
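
On the old->new name mappings asked about in the quoted thread: the
following is only a sketch of what such a mapping could look like if a
deployer published one per release step. The mapping format, the function
name and all the attribute/filter names are assumptions made for
illustration, not an existing biomart feature:

```python
def migrate_names(names, renames):
    """Translate stored internal names through an old->new mapping.

    Names that do not appear in `renames` are assumed to be unchanged.
    """
    return [renames.get(name, name) for name in names]


# Hypothetical mapping a deployer might publish between two releases.
RENAMES_38_TO_39 = {"old_filter_name": "new_filter_name"}

stored = ["old_filter_name", "unchanged_attribute"]
print(migrate_names(stored, RENAMES_38_TO_39))
# prints ['new_filter_name', 'unchanged_attribute']
```

Note that this only covers renamed attributes and filters. It does nothing
for changes in content, which is the case where developing against a fixed
version is still needed.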
