Re: blog: semantic dissonance in uniprot

Kingsley Idehen Mon, 23 Mar 2009 22:08:14 -0700

Michel_Dumontier wrote:

David,
  There's nothing like resurrecting this discussion one more time ;-)


For all representations one should make simplifying assumptions in order to 
increase the usability of the system.

In the life sciences, scientists don't care about database records - they care about the 
molecules and the biological processes about which facts have been collected. It is 
an artifact of database KR that we have such records in the first place. You probably 
won't see "Record" in a bio-ontology.

IMHO, 303 redirects simply complicate matters and are not useful.

We track provenance with namespaces and/or graphs. What could be simpler?

-=Michel=-


-----Original Message-----

From: David Booth [mailto:[email protected]]
Sent: Monday, March 23, 2009 10:27 PM
To: Michel_Dumontier
Cc: W3C HCLSIG hcls
Subject: RE: blog: semantic dissonance in uniprot

Eric,

On Sat, 2009-03-21 at 13:49 -0400, Michel_Dumontier wrote:

Eric and friends,

 I’m very sympathetic to the simplifying assumption of not
distinguishing between a record and the molecular entity it
represents, but . . . .


I do not think this would be a wise "simplification".  This is only a
simplification from one perspective: because it avoids having to mint
and maintain pairs of URIs instead of a single URI.  But the downstream
cost is that it creates an ambiguity (or "URI collision")
http://www.w3.org/TR/webarch/#URI-collision
that may cause trouble and be difficult to untangle later as the data is
used in more and more ways.  For example, if any of the same predicates
need to be used on both the record and the molecular entity, they will
become hopelessly confused.  Also, if disjointness assertions are
included then this overloading may cause logical contradictions.
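
To make the collision concrete, here is a minimal sketch in Python. The
identifiers (ex:P12345 and friends), the predicate names, and the two class
names are hypothetical and chosen only for illustration; they are not taken
from UniProt or any real ontology. It shows how one overloaded URI trips a
disjointness check and leaves a shared predicate ambiguous, while a pair of
URIs does not.

```python
# Hypothetical class names and identifiers, for illustration only.
RECORD, MOLECULE = "Record", "MolecularEntity"
DISJOINT = {frozenset({RECORD, MOLECULE})}  # assumed disjointness assertion


def types_of(triples, subject):
    """Collect every rdf:type asserted for a subject."""
    return {o for s, p, o in triples if s == subject and p == "rdf:type"}


def collisions(triples):
    """Return subjects typed with two classes that are declared disjoint."""
    subjects = {s for s, _, _ in triples}
    return sorted(
        s for s in subjects
        if any(pair <= types_of(triples, s) for pair in DISJOINT)
    )


# One URI used for both the record and the protein it describes.
overloaded = [
    ("ex:P12345", "rdf:type", RECORD),
    ("ex:P12345", "rdf:type", MOLECULE),
    # Was the record edited, or was the protein modified? Cannot tell.
    ("ex:P12345", "ex:modified", "2009-03-01"),
]

# Separate URIs, linked explicitly.
separated = [
    ("ex:record/P12345", "rdf:type", RECORD),
    ("ex:protein/P12345", "rdf:type", MOLECULE),
    ("ex:record/P12345", "ex:describes", "ex:protein/P12345"),
    ("ex:record/P12345", "ex:modified", "2009-03-01"),
]

print(collisions(overloaded))  # ['ex:P12345']
print(collisions(separated))   # []
```
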

Cool URIs for the Semantic Web
http://www.w3.org/TR/cooluris
describes best practices for minting URIs using 303 redirects to enable
the record to be obtained (indirectly) by following the URI for a
molecular entity.  If minting a separate URI for the molecular entity
seems onerous, it is trivial to use a 303-redirect service such as
http://thing-described-by.org/
to do the job for you. And if you want to set up your own 303-redirect
service, that site will even show you the exact files that are used to
implement it:

http://thing-described-by.org/#What_This_Site_Does_

Provenance (who said what) is extremely important in scientific analysis
-- explicitly tracking the evidence leading to scientific assertions.
It is easy for me to envision applications that will both use assertions
about a molecular entity *and* assertions about the records that
describe those molecular entities.

If you are just minting disposable URIs that aren't intended to be very
reusable anyway, then this ambiguity is not a problem, and it may be the
quickest solution to your problem.  But if you want your URIs to be long
lived and used by others for other applications, I think it would be a
mistake.

David Booth

Michel,

303 redirection serves a single purpose: enforcement of the Identity
principle for discrete data objects. If a datum lacks identity it cannot
in any way be resourceful.

The identity principle also implies that "Identity" stands alone from
all else; you cannot intermingle it with "representation", for instance.

30X redirection is simply how you can implement Identity using HTTP
based Identifiers, meaning: a URI for a real-world data object (aka
resource) and a representation of its description are distinct. Thus, to
honor the aforementioned principles, an HTTP Server receiving an HTTP
GET from a user agent that targets a data object via its URI must
re-route the request to an information resource URL that delivers a
description of the data object in question using a representation format
negotiated by the client and/or server.
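
The re-routing described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the /id/ and /doc/
path layout, the file extensions, and the function names are hypothetical,
and the content negotiation is deliberately naive (it takes the first
supported media type in the Accept header and ignores q-values).

```python
# Hypothetical URI layout: /id/<name> names the thing itself,
# /doc/<name>.<ext> is a document that describes it.
FORMATS = {
    "application/rdf+xml": "rdf",
    "text/turtle": "ttl",
    "text/html": "html",
}
DEFAULT_EXT = "html"


def pick_extension(accept):
    """Pick the first listed media type we can serve (q-values ignored)."""
    for item in accept.split(","):
        media_type = item.split(";")[0].strip()
        if media_type in FORMATS:
            return FORMATS[media_type]
    return DEFAULT_EXT


def respond(path, accept=""):
    """Return (status, headers) for an HTTP GET on `path`."""
    if path.startswith("/id/"):
        name = path[len("/id/"):]
        location = f"/doc/{name}.{pick_extension(accept)}"
        # 303 See Other: the thing is not a document, but a description
        # of it can be fetched from the Location URL.
        return 303, {"Location": location, "Vary": "Accept"}
    if path.startswith("/doc/"):
        # An information resource, served directly.
        return 200, {}
    return 404, {}
```

For example, respond("/id/P12345", "text/turtle") yields a 303 whose
Location is /doc/P12345.ttl, while a GET on that /doc/ URL yields a plain
200. The data object and its description keep separate identifiers.
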

If you are going to honor the Identity principle on the Web, in an
unobtrusive manner (i.e., leverage ubiquity of HTTP) there is no way
around the above.

The whole essence of the Linked Data Web comes down to distillation of
Data Objects from the host Information Resources (documents), i.e., making
the Data Objects referenceable and de-referenceable via URIs, in the same
manner exhibited by their host / container documents since the beginning
of hypertext. In short, think of this as Hyperdata linking added to the
broad concept of hyperlinking.

Scientists are always preoccupied with, and interested in, database
records because science lives and dies by the following processes:


1. Hypothesis
2. Observation
3. Conclusion

The steps above are about units of observation ("data"), contextual
representation ("information"), and conclusions ("knowledge").


In my experience, scientists are completely preoccupied with Data :-)

--


Regards,

Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen

President & CEO
OpenLink Software     Web: http://www.openlinksw.com
