On 3/29/07, Nicolas Le Novere <[EMAIL PROTECTED]> wrote:
> On Thu, 29 Mar 2007, Matt wrote:
>
> > Can you explain in more detail or point to explanations of
> > bqmodel:isDescribedBy?
>
> You can find some explanations at:
>
> http://www.ebi.ac.uk/compneur-srv/miriam-main/mdb?section=qualifiers
So there is no simple way to determine if this is a reference to a
journal article except through interpreting the URI?

>
> Note that qualifiers are optional to be MIRIAM-compliant. I personally
> think we should always use some qualification, otherwise an annotation
> becomes very difficult to use except for jumping from webpage to
> webpage.
>
> > Specifically:
> > - what is its intended meaning?
>
> Cf above. Note that the list of qualifiers is by no means frozen. We
> are already aware of several gaps (e.g. how do we qualify the relation
> between a peptide and the gene that encodes it?)
>
> > - when more than one of these is defined on a resource, how is this
> > interpreted? For example: is there some precedence implied somehow?
>
> This is up to the "tool" using the qualifiers. SBML does not allow
> nested qualifications. There is only an implicit "hasVersion" if several
> identical qualifiers are present:
>
> bqmodel:isDescribedBy toto
> bqmodel:isDescribedBy tata
>
> means is described by toto and is described by tata. In other words
> toto or tata describe the component.
>
> NOT toto and tata are necessary to describe the component.
>
> On top of that, BioModels DB adds some precedence
> http://www.ebi.ac.uk/compneur-srv/biomodels/doc/annotation.html
>
> But all that is not part of MIRIAM rules.
>
> > - how do you determine the kind of reference it is - for example a
> > pubmed uri? You have a datatype for vocab/database IDs in the
> > annotation scheme you described, but I don't see this in the
> > bqmodel:isDescribedBy examples.
>
> <rdf:li rdf:resource="http://www.pubmed.gov/#8983160"/>
>
> http://www.pubmed.gov/ means "the following identifier has to be
> interpreted as pointing to a PubMed entry".
>
> http://www.pubmed.gov/ is unique and should not normally
> change. However, sometimes it may nevertheless change for various
> reasons: URI too confusing, badly chosen, fusion of two resources
> etc.
> For instance, the old PubMed URI was
> http://www.ncbi.nlm.nih.gov/PubMed/
> It was misleading because it was tied to a particular physical
> resource at the NCBI.
>
> We have a deprecation system in place that resolves the
> old URIs and provides the new ones.
>
>
> > - how would you address auxiliary references as opposed to primary
> > references so that a machine interpreting it can make the distinction?
>
> I am not sure I understand that. Like primary and secondary accessions of
> UniProt?

For journal articles, or other publications, being able to identify
the primary reference(s) is useful. For database records, it would also
be useful to label a group as being the most important (or defining)
set, and others as 'helpful'. That is why I suggested that CellML
bibliographic referencing separate these two, and that the latter be
bound to a reason (a natural language comment would be fine) that
describes why that reference was made.

>
> >
> > <snip>
> >>
> >> I entirely agree with Melanie, people should be able to pick the
> >> resource they want, as long as they uniquely identify it. This is
> >> clearly described in the MIRIAM paper.
> >
> > I'm not sure what benefits one gains from letting people arbitrarily
> > choose what they want to use to identify something with. For example,
> > how do you work out if particular entities in one SBML model match
> > entities in another SBML model?
> >
> > Also, given that most of these resources are controlled vocabularies,
> > there is a lot of room for misunderstanding someone's intention when
> > using their choices of identifiers.
> >
> >
> >
> >> An annotation is formed of
> >> three parts:
> >>
> >> The data-type, e.g. PubMed entry, DOI, GO term, Cell-type ontology term ...
> >>
> >> The identifier of the particular information, e.g. 123456789, GO:0001234
> >> ...
> >>
> >> An optional qualifier that describes the relationship between the
> >> concept represented by the model component and the concept
> >> represented by the particular information.
> >>
> >> To help people implement that, we developed MIRIAM resources
> >> (http://www.ebi.ac.uk/compneur-srv/miriam/).
> >>
> >> If you download a model from BioModels DB in SBML (not in CellML at
> >> the moment, for obvious reasons highlighted by the current
> >> discussion), you will see something like:
> >>
> >> <bqmodel:isDescribedBy>
> >>   <rdf:Bag>
> >>     <rdf:li rdf:resource="http://www.pubmed.gov/#8983160"/>
> >>   </rdf:Bag>
> >> </bqmodel:isDescribedBy>
> >>
> >> But on the webpage, there is:
> >>
> >> <b>Publication ID:</b> <a
> >> href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=8983160"
> >> target="_blank">8983160</a>
> >>
> >> The URL is dynamically generated by MIRIAM webservices. In fact, in
> >> the new version of BioModels DB, to be released in the fall, the URL
> >> does not point to PubMed anymore, but to the more comprehensive EBI
> >> extended Medline. BUT the URI stored in the model is still the SAME.
> >>
> >> Similarly for a DOI:
> >>
> >> <bqmodel:isDescribedBy>
> >>   <rdf:Bag>
> >>     <rdf:li rdf:resource="http://www.doi.org/#10.1063/1.1681288"/>
> >>   </rdf:Bag>
> >> </bqmodel:isDescribedBy>
> >>
> >> is transformed into:
> >>
> >> <b>Publication ID:</b> <a href="http://dx.doi.org/10.1063/1.1681288"
> >> target="_blank">10.1063/1.1681288...</a>
> >>
> >> That system is very flexible. You can use any resource listed in
> >> MIRIAM resources, and this resource can be extended at will (note that
> >> we distribute an XML version of the resource for local use). But it is
> >> still robust and expressive.
> >>
> >> Cheers,
> >>
> >> On Wed, 28 Mar 2007, Melanie Nelson wrote:
> >>
> >>> Wow, I haven't posted to this list in a long time...
> >>> But I feel compelled to give a little advice as
> >>> someone who's spent a lot of time integrating
> >>> biological information and therefore has made a lot of
> >>> mistakes!
> >>>
> >>> By all means, have a best practice encouraging people
> >>> to use the GO cellular_component ontology to describe
> >>> organelles and cells. You could probably also use the
> >>> molecular_function ontology for proteins (although
> >>> this will be messier). However, neither is likely to
> >>> be complete, i.e., there will be models that
> >>> reference a biological entity not in the GO
> >>> ontologies. Also, there will be cases where the entity
> >>> the model references is most properly thought of as
> >>> related in some way (e.g., a subset, a superset, or a
> >>> "sibling") to the GO entity. You can spend ages
> >>> sorting this sort of thing out and coming up with
> >>> consistent rules for handling all the relationships.
> >>>
> >>>
> >>> Since you aren't really interested in sorting out this
> >>> biological mess, you may want to consider letting
> >>> people choose their own ontology and just reference
> >>> it.
> >>> An example of this practice is in the MIAME project:
> >>> http://www.mged.org/Workgroups/MIAME/miame_1.1.html
> >>>
> >>> About the citations: my memory of this is fuzzy, but I
> >>> think the original intent was that people should
> >>> provide the PubMed ID where possible. However, not all
> >>> journals are indexed in PubMed (for instance, there is
> >>> a CellML paper published in one that is not), so the
> >>> model needs to handle full citation info, too. The BQS
> >>> model handles both, and then some, which is why we
> >>> chose it.
> >>>
> >>> Hope this is helpful,
> >>> Melanie
> >>>
> >>>
> >>> --- Andrew Miller <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> Matt wrote:
> >>>>> I don't think this is a good idea.
> >>>>>
> >>>>> - I think bioentity should be deprecated, it has
> >>>>> no intrinsic semantic value.
> >>>>>
> >>>> It does, unfortunately, seem to usually target a literal node at
> >>>> the moment. It would be nice for this to at least be a resource,
> >>>> which could provide further information about the biological
> >>>> entity (or if we decide not to do that, at least a resource, with
> >>>> a dictionary and a process for adding new words to the dictionary
> >>>> to avoid duplication).
> >>>>
> >>>> It seems that GO (Gene Ontology) has terms for cell types,
> >>>> biological compartments, and so on, which would offer a better
> >>>> way to provide this information.
> >>>>
> >>>> I still think that this metadata is useful, even if the automated
> >>>> interpretation of it is currently difficult.
> >>>>> - If it is used currently, it should be left as its current
> >>>>> minimum specification which is to label and point to other
> >>>>> bioinformatics database IDs.
> >>>>>
> >>>> There are three layers of information here:
> >>>> Layer 1: What biological entity are we describing? (could be
> >>>> answered with a GO term).
> >>>> Layer 2: What information about that biological entity are we
> >>>> using? (could be answered with a reference to a paper, and
> >>>> perhaps even a reference to raw experimental data).
> >>>> Layer 3: How was that information translated into a model? (could
> >>>> be answered with a reference to a paper on the model).
> >>>>
> >>>> Layer 3 is clearly information about the model, and should be
> >>>> described as an arc of the model resource.
> >>>> Layer 1 is described by a literal at the moment.
> >>>>
> >>>> Layer 2 is therefore a gap, which we don't have any proper way to
> >>>> represent now.
> >>>>> - The problem is not 'biologically related papers' per se, but
> >>>>> one of identifying what was the primary publication or
> >>>>> publications that motivated a model.
> >>>>>
> >>>> The publication which motivated the expression of a model in
> >>>> CellML, or the publication which motivated the creation of the
> >>>> model? Most of the models in the repository were motivated by a
> >>>> paper about a model which was not initially expressed in CellML.
> >>>> However, the way that the metadata specification works now is
> >>>> that the paper which describes the model (not the paper which
> >>>> motivated it) is referenced from the information about the model
> >>>> (not information about the CellML file).
> >>>>> - There is also the case where a single publication that
> >>>>> contains a mathematical model is the one and only primary source
> >>>>> for the model itself - a rather common case at the moment.
> >>>>>
> >>>> This is what most models in CellML should aim to attain. Models
> >>>> can be submitted prior to publication as a model, but the step of
> >>>> going from the biology to a model is something which does need
> >>>> peer review.
> >>>>> I would prefer that the primary publication(s) be identified as
> >>>>> such, which covers the case where there are some models in the
> >>>>> repository built from general review papers of biology with no
> >>>>> math.
> >>>>>
> >>>> If a model is built in that way, it should reference the review
> >>>> papers as information about the biology, and the author should
> >>>> ideally submit it for publication, at which point the reference
> >>>> to the paper could be filled in.
> >>>>> I would prefer references to other related publications to be
> >>>>> bound explicitly to a comment in the model metadata - there
> >>>>> should be a reason identified by the author/editor/reviewer as
> >>>>> to why there has been such an association made.
> >>>>>
> >>>> The problem with this is that the comment is not machine
> >>>> readable, so there is then no way to get aggregate statistics on
> >>>> why models are linked. There is also a potential for significant
> >>>> duplication of information, as opposed to a set of standardised
> >>>> predicate terms for linking to a set of models.
> >>>>> As an aside, we also need to determine whether the bqs schema
> >>>>> provides enough detail to match publications across metadata
> >>>>> instances for different models, and whether we should be
> >>>>> complementing bibliographic data with pubmed Ids and the like.
> >>>>>
> >>>> I think that the PUBMED ID is always useful, because it allows
> >>>> CellML processing software (e.g. the repository) to link directly
> >>>> to the Entrez / PUBMED page. We could build links based on
> >>>> searches for authors and titles, but a unique ID is much cleaner.
> >>>> It seems that many repository models do have PUBMED IDs on them.
> >>>>
> >>>> Best regards,
> >>>> Andrew
> >>>>
> >>>> _______________________________________________
> >>>> cellml-discussion mailing list
> >>>> [email protected]
> >>>> http://www.cellml.org/mailman/listinfo/cellml-discussion
> >> --
> >> Nicolas LE NOVERE, Computational Neurobiology,
> >> EMBL-EBI, Wellcome-Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
> >> Tel: +44(0)1223494521, Fax: +44(0)1223494468, Mob: +44(0)7833147074
> >> http://www.ebi.ac.uk/~lenov, AIM: nlenovere, MSN: [EMAIL PROTECTED]
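
To make the URI convention discussed in this thread concrete (a data-type URI, then "#", then the identifier), here is a minimal sketch in Python. It is only an illustration and not the MIRIAM web services. The tables DEPRECATED, RESOLVERS and PUBLICATION_TYPES and the three functions are hypothetical names, and their entries are copied from the PubMed and DOI examples quoted in the thread. Treating PubMed and DOI as the "publication" data types is an assumption made for the example.

```python
# Hypothetical lookup tables; in practice this information would come
# from MIRIAM resources rather than being hard-coded.
DEPRECATED = {
    # old data-type URI -> current data-type URI (example from the thread)
    "http://www.ncbi.nlm.nih.gov/PubMed/": "http://www.pubmed.gov/",
}

RESOLVERS = {
    # data-type URI -> URL template for one physical location
    "http://www.pubmed.gov/": (
        "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi"
        "?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids={id}"
    ),
    "http://www.doi.org/": "http://dx.doi.org/{id}",
}

# Assumption: these data types are the ones that denote publications.
PUBLICATION_TYPES = {"http://www.pubmed.gov/", "http://www.doi.org/"}


def split_annotation(uri):
    """Split 'data-type#identifier' and map a deprecated data type to its successor."""
    data_type, sep, identifier = uri.partition("#")
    if not sep or not identifier:
        raise ValueError("expected '<data-type URI>#<identifier>': %r" % uri)
    return DEPRECATED.get(data_type, data_type), identifier


def is_publication(uri):
    """Decide from the data type alone whether the annotation is a publication."""
    data_type, _ = split_annotation(uri)
    return data_type in PUBLICATION_TYPES


def resolve(uri):
    """Build a browsable URL; the URI stored in the model never changes."""
    data_type, identifier = split_annotation(uri)
    return RESOLVERS[data_type].format(id=identifier)


if __name__ == "__main__":
    for stored in ("http://www.pubmed.gov/#8983160",
                   "http://www.ncbi.nlm.nih.gov/PubMed/#8983160",
                   "http://www.doi.org/#10.1063/1.1681288"):
        print(stored, "->", resolve(stored))
```

In this sketch the only place that knows a physical location is RESOLVERS, so pointing PubMed links at a different mirror means editing one template while the stored URIs stay the same. It also shows that telling a journal reference apart from other annotations still depends on interpreting the data-type part of the URI.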
