Re: My task from last week: Semantic free identifiers

Andrea Splendiani Mon, 20 Jun 2011 16:55:37 -0700

Well...
I don't think people are coding an information system by hand. Perhaps having
a look at files with just an editor happens more frequently, and I don't
think it can be avoided.
Perhaps the different points of view reflect a
different perception of OWL vs RDF?
Just a thought. I'm much more likely to move triples via scripts than using
a tool.
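
A minimal sketch of what such a script could look like if it follows the suggestion quoted further down (query the ontology for identifiers based on the labels). The label table, the "Brain" entry and the helper names are assumptions made for illustration; in practice the table would be loaded from the ontology's label annotations rather than typed in.

```python
# Hypothetical label table. The CO_/RO_ identifiers are the made-up ones used
# in this thread; the "Brain" mapping is an assumption for illustration only.
LABELS = {
    "CO_01234556": "Mitral valve",
    "CO_01234554": "Heart",
    "CO_01224554": "Brain",
    "RO_0000001": "part of",
}


def build_label_index(labels):
    """Map each normalised label to the set of identifiers that carry it."""
    index = {}
    for identifier, label in labels.items():
        index.setdefault(label.strip().lower(), set()).add(identifier)
    return index


def resolve(label, index):
    """Return the single identifier for a label; raise if missing or ambiguous."""
    matches = index.get(label.strip().lower(), set())
    if len(matches) != 1:
        raise LookupError(f"label {label!r} matched {len(matches)} identifiers")
    return next(iter(matches))


def to_triple(subject_label, predicate_label, object_label, index):
    """Build an opaque-identifier triple from three human-readable labels."""
    return tuple(
        resolve(label, index)
        for label in (subject_label, predicate_label, object_label)
    )


INDEX = build_label_index(LABELS)
print(to_triple("Mitral valve", "part of", "Heart", INDEX))
```

The script only ever sees labels as input, and a label that is unknown or shared by several terms stops the run instead of silently producing a wrong triple.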


ciao,
Andrea

On 20 Jun 2011, at 23:44, Michel_Dumontier wrote:

> IMHO, if you're still coding the content of an information system by hand,
> then you're going to introduce errors. A database curator should never
> assign their own identifier - this is internal to the technology and the
> information system. If you're a programmer, you should query the resource
> (ontology) for the identifiers based on the labels.  Be more sophisticated.
> Do it right. Build useable APIs/UIs for people.
>
> Best,
>
> m.
>
>
>
>> -----Original Message-----
>> From: [email protected] [mailto:public-semweb-lifesci-
>> [email protected]] On Behalf Of Sivaram Arabandi, MD
>> Sent: Monday, June 20, 2011 6:14 PM
>> To: M. Scott Marshall
>> Cc: Chime Ogbuji; Andrea Splendiani; [email protected]; James
>> Malone; HCLS; Jonathan Rees
>> Subject: Re: My task from last week: Semantic free identifiers
>>
>> Consider the following:
>>
>> 1. Readability - the former is far more readable than the latter:
>>       RO:part_of
>>              vs.
>>      <http://purl.obolibrary.org/obo/RO_0000001>
>>
>>    - this becomes even more apparent in a triple (CO = a 'Cardiology
>> Ontology'):
>>      CO:Mitral_valve   RO:part_of   CO:Heart
>>              vs.
>>      CO_01234556   RO_0000001   CO_01234554
>>              - doesn't make much sense (without tool support, which is
>> 'practically' non-existent).
>>
>> 2.  Mistakes are extremely difficult to spot with opaque identifiers:
>>      CO_01234556   RO_0000001   CO_01224554
>>              vs.
>>      CO:Mitral_valve   RO:part_of   CO:Brain
>>              - this is an obviously false statement - but not easy to spot
>> if opaque identifiers were used.
>>
>>      This leads to a very insidious problem, one that is difficult to
>> detect.
>>
>> 3. I am not sure why the following is an issue:
>>      " Is my http://experiment the same as yours?
>>        Is my http://gene? http://study?
>>        Does my gene http://leads_to disease make sense?"
>>
>>      - Obviously if I use "http://experiment" and you use
>> "http://experiment", we both are referring to the same thing.
>>      - But instead if I use "http://medicine/experiment" and you use
>> "http://biology/experiment", we 'may' not be referring to the same thing.
>>
>> 4. When using readable identifiers, it is difficult to make changes to an
>> existing term (Class) - I think this is a strength as opposed to an issue.
>> It raises the bar and should encourage authors (of models) to create terms
>> thoughtfully after due diligence. And when there is a real need to change
>> the term, i.e. its meaning has changed or it was inappropriate, ontology
>> patterns can be used to retire the term (if necessary, labelled as
>> deprecated) or reposition it.
>>      - 'Typos' in term names are definitely not a reason for having opaque
>> identifiers. Avoid them by having a good process for introducing terms. If
>> and when they occur, use ontology patterns to deal with them.
>>      - Using opaque identifiers with labels makes it very easy, almost too
>> easy, for the labels to be changed. Oftentimes users of a model may not be
>> aware of such changes.
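
Point 2 above can be illustrated with a small sketch: the two opaque triples differ by a single digit, and the difference only becomes obvious once labels are substituted. The label table is hypothetical (the identifiers are the made-up ones from the examples above, and mapping CO_01224554 to "Brain" is an assumption), as is the `render` helper.

```python
# Hypothetical label table; the "Brain" mapping is assumed for illustration.
LABELS = {
    "CO_01234556": "Mitral valve",
    "CO_01234554": "Heart",
    "CO_01224554": "Brain",
    "RO_0000001": "part of",
}


def render(triple, labels):
    """Replace each identifier with its label; flag identifiers with no label."""
    return " ".join(labels.get(term, f"<unknown {term}>") for term in triple)


intended = ("CO_01234556", "RO_0000001", "CO_01234554")
mistyped = ("CO_01234556", "RO_0000001", "CO_01224554")  # one digit differs

print(render(intended, LABELS))
print(render(mistyped, LABELS))
```

This is roughly the kind of label display that the tooling argument in this thread relies on; whether it is enough without readable identifiers is exactly what is being debated here.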
>>
>>
>> --Sivaram
>>
>>
>>
>> On Jun 20, 2011, at 4:15 PM, M. Scott Marshall wrote:
>>
>>> Hi Chime,
>>>
>>> The main reason is that when semantics and natural language are
>>> inserted into identifiers, some identifiers are doomed to become stale
>>> as thinking evolves or changes about the semantic representation. Or
>>> when a new 'name brand' is created for that namespace: I think that
>>> the best example of this was provided by Jonathan Rees for Shared
>>> Names - ever heard of 'locuslink' identifiers? I believe that Entrez
>>> Gene occupies the name branding of that space now. This is precisely
>>> the sort of problem that Shared Names would like to avoid by serving
>>> (non-ontological) identifiers from a 'neutral namespace'. In
>>> ontologies, the same principle applies (I see that Helena has supplied
>>> a good example).
>>>
>>> I agree with Mark about proper tooling - the tools should
>>> automatically display labels. It's true that I don't know of a SPARQL
>>> editor that does this to a satisfying degree yet, (except for one:
>>> SPARQL Assist Language-Neutral Query Composer from McCarty et al,
>>> shown at SWAT4LS in Berlin :) See Mark's post.) but that is not a
>>> reason to create identifiers and your knowledge representation in a
>>> way that won't stand the test of time.
>>>
>>> Shouldn't we consider RDF to be the bytecode of knowledge? Although I
>>> understand the difficulty of dealing with non-human readable
>>> identifiers in SPARQL and RDF, I believe that we are now looking at
>>> bytecode and complaining that it isn't human readable. It's true that,
>>> until the tools are available, it is difficult to write SPARQL
>>> queries. But if we applied the same logic to gene accession numbers,
>>> where would we be now? The SPARQL queries will eventually be 'under
>>> the hood', supplying labels to a GUI near you. :)
>>>
>>> Cheers,
>>> Scott
>>>
>>> On Mon, Jun 20, 2011 at 9:34 PM, Chime Ogbuji <[email protected]> wrote:
>>>> On Monday, June 20, 2011 at 3:08 PM, Andrea Splendiani wrote:
>>>>
>>>> Hi,
>>>> sorry to jump on this thread like this...
>>>>
>>>> To be honest, I'm kind of concerned by the insistence on
>>>> semantic-opaque identifiers.
>>>>
>>>> I am as well and I have been for some time.
>>>>
>>>> I understand the reason for them,
>>>>
>>>> Actually, I would be interested in hearing the reason for them enumerated,
>>>> because I have had a hard time imagining what could possibly offset the
>>>> (significant) impact on readability that it has on biomedical ontologies.
>>>> The barrier is already high for non-logicians and non-semantic web
>>>> aficionados to use biomedical ontologies.  Why set it any higher?
>>>> -- Chime
>>>>
>>>
>>>
>>>
>>> --
>>> M. Scott Marshall, W3C HCLS IG co-chair, http://www.w3.org/blog/hcls
>>> http://staff.science.uva.nl/~marshall
>>>
>>
>>
>>
>> -----
>> No virus found in this message.
>> Checked by AVG - www.avg.com
>> Version: 10.0.1382 / Virus Database: 1513/3716 - Release Date: 06/20/11

Andrea Splendiani
Senior Bioinformatics Scientist
Centre for Mathematical and Computational Biology
+44(0)1582 763133 ext 2004
[email protected]
