Re: Role of URI and HTTP in Linked Data

Nathan Wed, 10 Nov 2010 04:52:26 -0800

Hi Jiří,

Jiří Procházka wrote:

Hi,
having read all of the past week's, and still ongoing, discussion about HTTP
status codes, URIs and most importantly their meaning from a Linked Data
perspective, I want to share my thoughts on this topic.


I don't mean to downplay anyone's work but I think the role of URI and
HTTP specifications (especially semantics) in Linked Data is
overemphasized, which unnecessarily complicates things.

The URI is what makes Linked Data, Linked Data, it's the only hook to the real world, and via the domain name system + domain registration process gives us a hook on accountability, which is critically important. "#bar, as described by <http://example.com/foo>" resolves in two ways:

(1) <http://example.com/foo> as a name for the literal description/graph

(2) <http://example.com/foo> as a way of saying "the author of the description available at <http://example.com/foo>, stated X, and was responsible as delegated by the owners of example.com", where X is (1) and provable by the HTTP messages and logs. A status code of 200 vs 303 to some other domain or URI vs 4xx or 5xx plays a big part in that chain of accountability / validity / trust.
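
As a rough illustration of how the status code feeds that chain, here is a minimal Python sketch. The function name and the wording of each reading are assumptions made for illustration; they are not defined by the HTTP specification or by any Linked Data document.

```python
def accountability_hint(status: int) -> str:
    """Give a rough reading of what an HTTP status code says about
    who answered for the description (illustrative wording only)."""
    if status == 200:
        return "description served directly by the dereferenced URI"
    if status == 303:
        return "redirected: the description lives at some other URI"
    if 400 <= status < 500:
        return "client error: no description obtained"
    if 500 <= status < 600:
        return "server error: no description obtained"
    return "not covered by this sketch"
```

A consumer could log this reading next to the request URI, so that the 200 vs 303 vs 4xx/5xx distinction stays attached to whatever statements were fetched.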

Also never forget that Linked Data is just Links with literals, a Link as in a hyperlink. It's the description of a relationship between two things (names or literals) which makes a link a link, thus each link is a statement, statements form descriptions, descriptions are literal things. Triples are statements, Graphs are descriptions.
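
To make "Triples are statements, Graphs are descriptions" concrete, here is a minimal Python sketch. The `Triple` and `describe` names and the example URIs are hypothetical, chosen only for illustration; this is not how any particular RDF library models graphs.

```python
from typing import FrozenSet, NamedTuple


class Triple(NamedTuple):
    """One statement: a link between two things (names or literals)."""
    subject: str
    predicate: str
    obj: str


# A description is simply the set of statements it contains.
Graph = FrozenSet[Triple]


def describe(*statements: Triple) -> Graph:
    """Bundle statements into an immutable description."""
    return frozenset(statements)


# Hypothetical example statements, not taken from any real dataset.
knows = Triple("http://example.com/foo#bar",
               "http://xmlns.com/foaf/0.1/knows",
               "http://example.com/foo#baz")
name = Triple("http://example.com/foo#bar",
              "http://xmlns.com/foaf/0.1/name",
              "Bar")

first = describe(knows, name)
second = describe(name, knows)
```

Here `first == second` holds whatever order the statements were written in, which is one way to read the claim that a description is a literal thing identified by its own content.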

There's a lot more to the simple triple with http URIs than many realise, sure it makes a nice RDF data bus for us and gives us an almost universal data format, which we can exploit and bring to the fore via linked data, but that's just the tip of the iceberg, and ultimately of very little use without the URI and HTTP.


a few notes..

I think we can all agree that the core idea of Linked Data is that
information is expressed using unique identifiers (URIs) which I can simply
use to get useful information about the thing the identifier represents
(hence the mandated, relatively simple, widely supported transfer protocol HTTP).


as above, that's not the core of linked data, that's the surface.

So let's stick with this. Let's just treat URIs as RDF does - as simple
names. When we dereference a URI we get back some useful data and
that's it.


So, that'll be like mailto: or pop: or tel: then..

If we want to express that the data fetched are in fact a
document, we use the wdrs:isDefinedBy property. The data fetched are
just data, and any info about them should be contained in them.

Expressing that the data fetched is in fact a document is indeed optional, but any response is always a message, a description, a /literal/ thing. You can't pretend it doesn't exist, it does - to say a description is anything other than that is like me saying you're an apple and insisting everybody believe me. Literals are self identifying, self naming, things.

Why? Why no Content-Location? There is no reason to require additional
complexity, building extra information layers. Publishing the document
information in the data itself most probably would be simpler for both
the publishing and the consuming party. Treating HTTP as a simple
blackbox is what is mostly done in practice anyway.


Read only world then?

What if someone doesn't publish the document data? Would it mean the URI
we dereferenced refers both to the thing described and the description
of it? Kind of.

There is no kind of. The description is a literal thing all of its own, it's the same thing regardless of media type or whether you write it on a bit of paper, it's a self identifying literal thing.

What I mean is the consumer side can add additional
information to the data about the document (when and how fast it was
fetched etc) and if the data doesn't contain info about the document
already, it could add it:
  <uri> wdrs:isDefinedBy [ wdsr:location "uri" ] . # or something like this
Non-RDF data should use their equivalents.
That is the most important thing I had to say - let's keep semantics in
the data.
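
The consumer-side step described above could be sketched in Python roughly as follows. This is only an illustration: triples are plain tuples, the property names are copied as written in this message, and `ex:fetchedAt`, the `_:doc` blank node label and the `annotate` function are hypothetical names introduced for the example.

```python
IS_DEFINED_BY = "wdrs:isDefinedBy"  # property name as written in the message
LOCATION = "wdsr:location"          # spelled as in the message
FETCHED_AT = "ex:fetchedAt"         # hypothetical property for fetch metadata


def annotate(uri, triples, fetched_at):
    """Return a copy of the fetched triples with consumer-side
    document info added, without touching the publisher's own."""
    result = set(triples)
    docs = sorted(o for s, p, o in result
                  if s == uri and p == IS_DEFINED_BY)
    if docs:
        # The publisher already named the document: reuse it.
        doc = docs[0]
    else:
        # No document info in the data: add it on the consumer side.
        doc = "_:doc"
        result.add((uri, IS_DEFINED_BY, doc))
        result.add((doc, LOCATION, uri))
    result.add((doc, FETCHED_AT, fetched_at))
    return result
```

The input set is copied rather than modified, so the data as published and the data as amended by the consumer remain distinguishable.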

I believe it is quite important that the range of wdrs:isDefinedBy is a
document class, which should be domain of wdsr:location.


so one location / graph / description is a document, and the other isn't!?

I am going to explain why I think so, but beware, at this point I get a
bit philosophical :)

What is pretty awesome about RDF, which is something Linked Data could
learn, is how it dabbled the ontological (used as philosophical term)
issues - existence, being and reality. In order to support maximum
expressiveness and compatibility with various world-views it says the
least about it. Big part of that is dealing with identity - if a
caterpillar turns into butterfly, is it still the same thing? Am I still
I when I get older and change? RDF doesn't offer any answers to such
questions, neither if there are only information resources and other
resources. There are just names which identify objects or concepts,
which we describe with names and the final description matches some
number of objects or concepts we know, while the better the description
is, the lower the number is.

RDFS classes are used to describe various aspects of objects or
concepts, which allow us to express ourselves much less ambiguously,
using properties with defined domain and range. On the other hand we can
describe those aspects separately if we consider them a separate entity.
For example someone can say I am averagely skilled as an English
speaker, or that my English skill is mediocre, or that I am one of
averagely skilled English speakers. Similarly one could say <book> is
30000 characters long as its content, or that <book> is 20
characters long as its title, or that <book> is 3000 characters long as the
description received on dereferencing. It shouldn't matter if I consider
a book name as part of it or not, if I use as unambiguously defined
properties as possible. However vocabularies with not very well defined
terms (consider an example "length" property), which generally mimic
natural language properties, are used widely, which is why we should
have wdrs:isDefinedBy.
The point of this philosophical exercise was to say that we shouldn't be
saying "a URI represents one resource" or trying to define what
resources are or what existence is, but should recognize the context of the
original information when modifying it (especially amending).

indeed, we should just realise that all we can do is describe things by making statements about them, and then provide a way to say how one described thing relates to another.


Best,

Nathan
