On 11/12/10 7:22 AM, Lars Heuer wrote:
Hi Kingsley,
[...]
If I want an RDF/XML representation of the document, I can ask for
Accept: application/rdf+xml
and Wikipedia would (ideally) return an RDF/XML representation of that
resource which tells me that John Lennon is a person who was born at
... murdered at ... was part of a group named ... etc.
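
[A minimal sketch of the content-negotiation request described above, in Python. It only builds the request and does not send it. The helper name build_request is hypothetical, and whether the server actually offers an RDF/XML representation is an assumption ("ideally", as said above).]

```python
from urllib.request import Request


def build_request(url: str, media_type: str = "application/rdf+xml") -> Request:
    """Prepare (but do not send) a GET request that asks for one media type.

    The Accept header is how the client states which representation of the
    resource it would like to receive.
    """
    return Request(url, headers={"Accept": media_type}, method="GET")


req = build_request("http://en.wikipedia.org/wiki/John_Lennon")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

[Passing "application/x-tm+ctm" as media_type would express the Topic Maps variant of the same request.]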
Yes, so you received a document stating all of the above, who is the
Subject? How is the Subject Identified?
I don't understand the question. A person named "John Lennon" is the
subject. The subject is identified by the IRI.
If I issue a
GET <http://en.wikipedia.org/wiki/John_Lennon> Accept: application/x-tm+ctm
and the server responds with (using the Topic Maps syntax CTM since I
am not that familiar with RDF syntaxes):
<http://en.wikipedia.org/wiki/John_Lennon>
isa ex:person;
- "John Lennon";
born-at 1940-10-09;
died-at 1980-12-08;
member-of <http://en.wikipedia.org/wiki/The_Beatles>.
I'd know that the above mentioned IRI represents a NIR (a person)
which was born at .. died at .. etc.
Where is the problem with that approach?
[...]
You have to drop the fact that your non-web-sign-processor (DNA CPU)
already groks "John Lennon", and does a lot of fancy processing with
frames en route to disambiguation and context manifestation.
I don't understand that statement. A web agent would also know that
the IRI represents a person who has the name "John Lennon".
[...]
I see, DBpedia provides different IRIs. That's fine. But it's not
possible to keep <http://en.wikipedia.org/wiki/John_Lennon> (or
<http://dbpedia.org/resource/John_Lennon> if that matters) and make
statements about that, right? I cannot make statements which are
interpreted correctly without an Internet connection. I need the status
codes.
[...]
Personally, it can be solved at the application level by application
developers making a decision about the source of semantic fidelity, i.e.,
HTTP or the Data itself.
Yes, it can be solved at application level. Maybe on a per domain
basis, but that's exactly the problem. Neither 303 nor 200 solves the
identity problem. Unless we'd introduce a concept to distinguish
between NIRs and IRs (like Topic Maps does with Subject Identifiers
and Subject Locators).
Topic Maps isn't doing anything that isn't being done via Linked Data
patterns, already. I've never grokked this generally held position from
the Topic Maps community.
An Identifier is an Identifier. It has a Referent.
A URI is an Identifier.
You can use an Identifier as Name or an Address.
Trouble is that HTTP is about document location and content
transmission. Thus, all URLs (Location Identifiers / Addresses)
ultimately resolve to Data. URIs in the generic sense don't, and you can
use an HTTP URI as a Name.
The 303 heuristic is how Name | Address disambiguation is handled re.
Linked Data.
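
[A minimal sketch of the 303 heuristic as a lookup on the HTTP status code, in Python. The function name classify_by_status and the labels "name" / "address" / "unknown" are illustrative assumptions, not part of any specification.]

```python
def classify_by_status(status: int) -> str:
    """Apply the 303 heuristic to the status code returned for a GET.

    303 See Other: the URI is used as a Name; the description of its
                   referent lives at another location.
    2xx:           the URI is an Address; it resolved to data directly.
    anything else: the heuristic says nothing.
    """
    if status == 303:
        return "name"
    if 200 <= status < 300:
        return "address"
    return "unknown"


for code in (200, 303, 404):
    print(code, classify_by_status(code))
```

[Note that the answer depends only on the response, not on the data, which is why it cannot be evaluated without issuing the request.]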
A new option has emerged, which I think is pretty much what you outline
re. Topic Maps where, based on self-describing structured content (e.g.
RDF formatted data) transmitted from a URL, a slash terminated URL can
be treated as an HTTP URI based Name, by an application overriding the
conventional assumptions culled from HTTP responses i.e., 200 OK,
becomes Okay.
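
[A minimal sketch of that option, in Python: the application inspects the structured data returned with a 200 OK and decides from the data whether the URL is being used as a Name. Triples are modeled as plain (subject, predicate, object) tuples; the function treat_as_name, the prefixed names, and the DOCUMENT_TYPES set are all hypothetical.]

```python
# Types that mark the subject as a document (assumed vocabulary, for illustration).
DOCUMENT_TYPES = {"foaf:Document"}


def treat_as_name(url, status, triples):
    """Return True if the data says the URL denotes something other than a document.

    The conventional reading of 200 OK ("this URL is a document address") is
    overridden only when the returned data types the URL itself as a
    non-document thing.
    """
    if status != 200:
        return False
    types = {o for (s, p, o) in triples if s == url and p == "rdf:type"}
    return bool(types) and not (types & DOCUMENT_TYPES)


url = "http://example.org/people/john_lennon/"
data = [
    (url, "rdf:type", "ex:Person"),
    (url, "ex:name", "John Lennon"),
]
print(treat_as_name(url, 200, data))
```

[If the data carries no type for the URL, the sketch falls back to the conventional reading.]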
I'd tend to agree that 200 seems to be easier to handle than 303 (even
if it does not solve the identity problem either).
I don't see how it doesn't provide a solution to the Name | Address
disambiguation problem.
And fragment IRIs
do not solve that problem either. It's just a problem shift, imo.
Maybe an imperfect solution since disambiguation isn't handled by the
data itself.
[...]
Side note: Each subject/object needs a GET (assuming that predicates
are always NIRs) to interpret the statement correctly... Does it
scale? Let's assume you'd send me a DBpedia dump. I cannot interpret
it correctly, unless I have an Internet connection?
What about when I send you DBpedia in the post on a USB key? :-)
I don't see how that statement contradicts my statement that I always
need an Internet connection.
If you send me DBpedia offline, I need an
Internet connection if I want to import the stuff and want to
interpret the triples correctly if a 200 / 303 status code is
necessary to handle the IRIs right.
You can work with DBpedia offline, assuming you install Virtuoso +
DBpedia data from a USB to your local drive. It will work absolutely fine.
The green pages are just browser pages, everything you do online you can
replicate offline, no problem at all re. DBpedia data. Of course if you
follow an out-bound link to a resource (descriptor document) outside the
DBpedia data set, you will need an internet connection if the data isn't
local.
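
[A minimal sketch of the offline case, in Python: statements about an IRI are answered from a locally loaded set of triples, and only IRIs with no local description would require a network fetch. The in-memory list stands in for a locally installed store; the function names and the sample triples are illustrative assumptions, not actual DBpedia data.]

```python
LOCAL_TRIPLES = [
    ("http://dbpedia.org/resource/John_Lennon", "rdf:type", "ex:Person"),
    ("http://dbpedia.org/resource/John_Lennon", "ex:memberOf",
     "http://dbpedia.org/resource/The_Beatles"),
    ("http://dbpedia.org/resource/The_Beatles", "rdf:type", "ex:Band"),
]


def describe(triples, subject):
    """Return all (predicate, object) pairs held locally for a subject."""
    return [(p, o) for (s, p, o) in triples if s == subject]


def needs_network(triples, iri):
    """True if nothing is known locally about the IRI, so it must be fetched."""
    return not any(s == iri for (s, _, _) in triples)


print(describe(LOCAL_TRIPLES, "http://dbpedia.org/resource/John_Lennon"))
print(needs_network(LOCAL_TRIPLES, "http://dbpedia.org/resource/The_Beatles"))
print(needs_network(LOCAL_TRIPLES, "http://example.org/outside/resource"))
```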
Best regards,
Lars
--
Regards,
Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen