On 6/19/11 12:05 PM, Hugh Glaser wrote:
"A step too far"?
Hi.
I've sort of been waiting for someone to say:
"I have a system that consumes RDF from the world out there (eg dbpedia), and it
would break and be unfixable if the sources didn't do 303 or #."
Plenty of people saying they can't express what they want without it.
And plenty of people saying that the code they write might not be able to
properly understand some of the RDF it receives.
But no actual examples in the wild (at least as far as I can tell in a lot of
messages).
This might be for quite a few reasons, such as:
1) There are no such consuming systems;
2) The existing consuming systems would not break.
Number (1) would be too embarrassing, and is wrong because I have some, so I'll
think about number (2).
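For reference, the "303 or #" convention mentioned above can be sketched from the consumer's side. This is a minimal illustration of the Range-14 rule as it is usually stated (a 2xx response means a document, while a 303 or a hash URI may name anything); the function name `classify` and the example.org URIs are hypothetical, and no real consuming system is being described.

```python
from urllib.parse import urldefrag


def classify(uri, status=None):
    """Say what a consumer may conclude about `uri` under the 303-or-# convention.

    `status` is the HTTP status code seen when dereferencing the URI,
    or None if the consumer never dereferenced it.
    """
    _, fragment = urldefrag(uri)
    if fragment:
        return "anything"    # hash URI: may name a non-document thing
    if status is None:
        return "unknown"     # never looked, so nothing can be concluded
    if 200 <= status < 300:
        return "document"    # 2xx: the URI names an information resource
    if status == 303:
        return "anything"    # 303 See Other: could be any resource
    return "unknown"         # other responses tell us nothing


# Hypothetical URIs, for illustration only.
print(classify("http://example.org/dog#this"))
print(classify("http://example.org/dog", 200))
print(classify("http://example.org/dog", 303))
```

A consumer that never calls something like `classify` is exactly the kind of system that would not break if the sources ignored the convention.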
There seem to be some axes in the discussion:
publish / consume
long/medium term / shorter term
ideal / pragmatic
Interestingly, we don't seem to have a strong theory / practice axis, which is
great.
As a publisher, I/we have had to work pretty hard to conform to really quite
complex requirements for publishing RDF as Linked Data; not just Range-14, but
voiD, sitemaps and various bits and pieces that Kingsley always tells me to do
in the RDF.
As a consumer, it has been pretty simple: "Well guv, thanks for the URI, here's some
RDF."
It has always been something of a source of angst (if not actual pain) to me
that none of the extra work I put into publishing RDF is ever used by me or
anyone else, as far as I know.
Er, we use it :-)
The problem with this whole Linked Data thing is that it's truly Ninja tech.
The killer conductor of value is the LINK. This lethal weapon applies to
all dimensions of the Web:
1. Information Space
2. Data Space
3. Knowledge Space.
Trouble is, where do we find strong anecdotes for a cross dimensional
lethal weapon? I try to use Star Wars and the FORCE at times, but even
that doesn't quite nail what we are dealing with here. Thus, we could
take another approach i.e., embrace and extend what we know is anomalous
since the AWWW architecture (FORCE) actually lets us do this anyway.
In fact, some of the sites I consume actually don't do things "properly" - I
might have had to change my consuming systems to cope with this, but I don't, because
they already cope fine.
Exactly! You are using the FORCE :-)
Why is it not a problem? One obvious reason is that the consuming application
is actually looking for specific knowledge about things.
I don't have a consuming system that is considering both lexical and animal
subjects, and so confusion does not arise.
You have a Data Space dimension app. The Information Space dimension
doesn't interfere with your world view. This is key in many ways. For
instance, imagine your app were of the Information Space dimension
instead: the effect would be very close to what we see today re. those
that see Name and Address disambiguation as impractical overkill since
nothing breaks in the world they experience.
In fact, it is the predicates that tend to distinguish satisfactorily for me
(as has been pointed out by some people).
Yep! The Data Space realm lets you Describe anything with clarity, and
even when unclear, agents can ultimately agree to disagree without
obliteration.
Thus, if I get a triple that says the URI that would resolve to my Facebook
page foaf:knows the URI that would resolve to your Facebook page, I (my system)
will happily interpret that as one person (or whatever) foaf:knows the other. I
certainly don't want to go and resolve these to find out to what the URIs
actually resolve. And if I did, what would I do about it? Ignore it?
As you would in code generally, encounter an exception, and decide if
you avoid making it a critical fault :-)
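The exchange above can be sketched as a consumer that reads meaning from the predicate and never dereferences the URIs, and that treats input it cannot handle as a non-critical exception. This is a minimal sketch, not anyone's actual system: triples are assumed to arrive as plain (subject, predicate, object) tuples, and the function name `acquaintances` and the facebook.example URIs are hypothetical.

```python
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"


def acquaintances(triples):
    """Collect (person, person) pairs from foaf:knows triples.

    Subject and object URIs are never dereferenced: the predicate alone
    says both ends are to be read as people (or agents), whatever the
    URIs would return over HTTP.
    """
    pairs, skipped = [], []
    for triple in triples:
        try:
            s, p, o = triple
        except (TypeError, ValueError):
            skipped.append(triple)   # malformed input: note it and carry on
            continue
        if p == FOAF_KNOWS:
            pairs.append((s, o))
    return pairs, skipped


# Hypothetical data: two page-like URIs linked by foaf:knows, plus junk.
data = [
    ("http://facebook.example/hugh", FOAF_KNOWS, "http://facebook.example/kingsley"),
    ("broken", "pair"),
]
pairs, skipped = acquaintances(data)
print(pairs)
print(skipped)
```

Whether the skipped items should ever become a critical fault is left to the caller, which is the decision described in the reply above.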
In fact, as has also been mentioned, you can define domains, ranges and
restrictions for as long as you like, but it is quite possible and likely that
the users of URIs will continue blissfully unaware of any of this, in exactly
the same way that they continue unaware that there might be something ambiguous
about the URIs they are using.
Yes, when they operate in the Information Space dimension.
By the way, as is well-known I think, a lot of people use and therefore must be
happy with URIs that are not Range-14 compliant, such as
http://www.w3.org/2000/01/rdf-schema .
In the Information Space dimension, yes. In that dimension it doesn't
matter.
When we help people publish, it really is tough to engage them long enough to
care about the complex issues, and they often get it wrong - I am engaged with
quite a few people who are now publishing serious amounts of interesting RDF
where I have contacted them to try to help. The status of the conversations is
that they have fixed what they can, and are now thinking (for a long time)
about how they might configure their systems to do it properly - but they may
never get there. I will still want to use their RDF.
Yes, and all you do is show them a tweaked version of their RDF, should
they wander by your data space (which is grounded in the Data Space realm).
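One possible shape for such a "tweaked version" is sketched below. It is an assumption, not necessarily the tweak meant above: every subject URI without a fragment is re-pointed at a hash URI (the `#this` suffix is an arbitrary choice), and the new URI is linked back to the original document with foaf:isPrimaryTopicOf. The function name `tweak` and the example.org URIs are hypothetical, and triples are again assumed to be plain tuples in a list.

```python
from urllib.parse import urldefrag

FOAF_IS_PRIMARY_TOPIC_OF = "http://xmlns.com/foaf/0.1/isPrimaryTopicOf"


def tweak(triples, suffix="this"):
    """Re-point fragment-less subject URIs at hash URIs and link each
    new URI back to the document it came from."""

    def thing(uri):
        # Leave URIs that already carry a fragment untouched.
        return uri if urldefrag(uri)[1] else f"{uri}#{suffix}"

    subjects = {s for s, _, _ in triples}
    out = []
    for s, p, o in triples:
        # Objects are rewritten only when they also occur as subjects,
        # so literals and external URIs pass through unchanged.
        out.append((thing(s), p, thing(o) if o in subjects else o))
    for s in sorted(subjects):
        if thing(s) != s:
            out.append((thing(s), FOAF_IS_PRIMARY_TOPIC_OF, s))
    return out


# Hypothetical source data, for illustration only.
source = [
    ("http://example.org/hugh", "http://xmlns.com/foaf/0.1/knows", "http://example.org/kingsley"),
    ("http://example.org/kingsley", "http://xmlns.com/foaf/0.1/name", "Kingsley"),
]
for triple in tweak(source):
    print(triple)
```

The publisher's original RDF is left as it is; the tweaked copy lives only in the consumer's own data space.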
So, trying to be a little brief:
I have always felt that the full Range-14 distinction was in danger of being a
Step Too Far.
It's fine, we just can't present it in edict form to people experiencing
and operating with the Information Space dimension of the WWW.
Yes, it does matter, and it is likely (or at least possible) we will pay a
price in the end.
You betcha!
But the world is trying to pass us by - it has at least pulled alongside.
IMHO. People are doing what they always do: ignore warnings and scramble
desperately for cures, post calamity. Note, in most cases, using the
industry behemoths as examples, calamity == business model erosion
courtesy of exponentially increasing opportunity costs.
We must work out why we seem to have lost any lead we had, because it is likely
to be the same reason we will get left behind.
We need to accept that the WWW has many dimensions to it, Information,
Data, and Knowledge. Thus, we can't speak from the Data Space dimension
to folks in the Information Space dimension and expect immediate
comprehension. We could (hence power of HTTP 200 OK) operate within the
Information Space dimension and unveil the Data Space dimension. Like
all contextual matters, we have to align "context lenses" in order for
us to develop constructive dialog. This is why "embrace and extend"
(not the way Microsoft did it many years ago) is the way to go re.
unveiling Data Space dimension from the Information Space dimension.
And I happen to believe that what we have can be better than the alternatives.
Sorry Pat, I don't actually have a proposal.
My proposal is this: we just need to be more accommodating of what we
may perceive as imperfections, in our data space oriented context. We
should always embrace structured data contributions in any form. We can
transform structured data to high fidelity linked data in a myriad of
ways that ultimately help others comprehend what's taking shape re. the
WWW as a Global Data Space.
But I do know we need to be liberal in what we consume.
+1000
And we might need to be a bit more liberal in what we praise, or at least be
nicer to people who want to publish RDF and don't do Range-14.
+1000
Kingsley
Best
Hugh
On 19 Jun 2011, at 05:05, Pat Hayes wrote:
Really (sorry to keep raining on the parade, but) it is not as simple as this. Look, it
is indeed easy to not bother distinguishing male from female dogs. One simply talks of
dogs without mentioning gender, and there is a lot that can be said about dogs without
getting into that second topic. But confusing web pages, or documents more generally,
with the things the documents are about, now that does matter a lot more, simply because
it is virtually impossible to say *anything* about documents-or-things without
immediately being clear which of them - documents or things - one is talking about. And
there is a good reason why this particular confusion is so destructive. Unlike the
dogs-vs-bitches case, the difference between the document and its topic, the thing, is
that one is ABOUT the other. This is not simply a matter of ignoring some potentially
relevant information (the gender of the dog) because one is temporarily not concerned
with it: it is two different ways of using the very names that are the fabric of the
descriptive representations themselves. It confuses language with language use, confuses
language with meta-language. It is like saying giraffe has seven letters rather than
"giraffe" has seven letters. Maybe this does not break Web architecture, but it
certainly breaks **semantic** architecture. It completely destroys any semantic coherence
we might, in some perhaps impossibly optimistic vision of the future, manage to create
within the semantic web. So yes indeed, the Web will go on happily confusing things with
documents, partly because the Web really has no actual contact with things at all: it is
entirely constructed from documents (in a wide sense). But the SEMANTIC Web will wither
and die, or perhaps be still-born, if it cannot find some way to keep use and mention
separate and coherent. So far, http-range-14 is the only viable suggestion I have seen
for how to do this. If anyone has a better one, let us discuss it. But just blandly
assuming that it will all come out in the wash is a bad idea. It won't.
Pat
On Jun 18, 2011, at 1:51 PM, Danny Ayers wrote:
On 17 June 2011 02:46, David Booth wrote:
I agree with TimBL that it is *good* to distinguish between web pages
and dogs -- and we should encourage folks to do so -- because doing so
*does* help applications that need this distinction. But the failure to
make this distinction does *not* break the web architecture any more
than a failure to distinguish between male dogs and female dogs.
Thanks David, a nice summary of the most important point IMHO.
Ok, I've been trying to rationalize the case where there is a failure
to make the distinction, but that's very much secondary to the fact
that nothing really gets broken.
Cheers,
Danny.
http://danny.ayers.name
------------------------------------------------------------
IHMC (850)434 8903 or (650)494 3973
40 South Alcaniz St. (850)202 4416 office
Pensacola (850)202 4440 fax
FL 32502 (850)291 0667 mobile
phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
--
Regards,
Kingsley Idehen
President & CEO
OpenLink Software
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen