Leigh Dodds wrote:
Hi Nathan,

On 4 November 2010 18:08, Nathan <[email protected]> wrote:
You see it's not about what we say, it's about what others say, and if 10
huge corps analyse the web and spit out billions of triples saying
that anything 200 OK'd is a document, then at the end when we consider
the RDF graph of triples, all we're going to see is one statement saying
something is a "nonInformationResource" and a hundred others saying it's
a document and describing what it's about together with its format and
so on.

Are you suggesting that Linked Data crawlers could/should look at the
status code and use that to infer new statements about the resources
returned? If so, I think that's the first time I've seen that
mentioned, and am curious as to why someone would do it. Surely all of
the useful information is in the data itself.

Not at all. I'm saying that if big-corp builds a /web crawler/ that describes what documents are about and publishes RDF triples, then if you use 200 OK, you'll get (statements similar to) the following asserted throughout the web:

  </toucan> :primaryTopic dbpedia:Toucan ; a :Document .

Now, move down the line a couple of years, reason over a triple dump of the web-of-data, and you'll find the problem: the way to resolve the conflict is to first strip everything that's a :Document, so all the slash URIs will be stripped, including </toucan>.
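
To make that concrete, here is a rough sketch in Python of the "strip everything that's a :Document" step. The triples, the :NonInformationResource type and the </toucan#bird> hash URI are made up for illustration, and plain tuples stand in for a real RDF library.

```python
# Hypothetical triples harvested from the web; all names are illustrative only.
triples = {
    ("</toucan>", ":primaryTopic", "dbpedia:Toucan"),
    ("</toucan>", "a", ":Document"),                # asserted by a crawler
    ("</toucan>", "a", ":NonInformationResource"),  # asserted by the publisher
    ("</toucan#bird>", "a", ":NonInformationResource"),
}

def strip_documents(triples):
    """Drop every triple whose subject has been typed as a :Document."""
    documents = {s for (s, p, o) in triples if p == "a" and o == ":Document"}
    return {t for t in triples if t[0] not in documents}

remaining = strip_documents(triples)
for triple in sorted(remaining):
    print(triple)
```

Under these assumptions only the hash URI survives; everything said about </toucan>, including the publisher's own statement, is thrown away with the crawler's.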

I'm also saying that 303 doesn't solve this half the time either, because most HTTP clients blackbox the process, so their process is:

  uri = "/toucan";
  doc = get( uri );             // any 303 is followed silently
  makeStatements( uri , doc );  // statements are still about the original uri

Again, same problem.
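
A rough Python sketch of the process above, using a toy in-memory "web" instead of real HTTP so it runs anywhere. FAKE_WEB, /toucan.rdf, get_with_final_uri and the body text are all hypothetical; the only point is to contrast a client that hides the 303 with one that reports which URI actually answered 200.

```python
# Toy in-memory "web": URI -> (status, Location header or None, body).
# Every URI and body here is invented for illustration.
FAKE_WEB = {
    "/toucan": (303, "/toucan.rdf", ""),
    "/toucan.rdf": (200, None, "<rdf about toucans>"),
}

REDIRECTS = (301, 302, 303, 307)

def get(uri):
    """Black-box GET: follows redirects silently and returns only the body."""
    status, location, body = FAKE_WEB[uri]
    while status in REDIRECTS:
        status, location, body = FAKE_WEB[location]
    return body

def get_with_final_uri(uri):
    """Redirect-aware GET: also reports the URI that finally answered."""
    status, location, body = FAKE_WEB[uri]
    while status in REDIRECTS:
        uri = location
        status, location, body = FAKE_WEB[uri]
    return uri, body

def make_statements(uri, doc):
    """Describe whatever URI it is handed as a document."""
    return {(uri, "a", ":Document")}

# Black-boxed client: the 303 is invisible, so /toucan is called a document.
naive = make_statements("/toucan", get("/toucan"))

# Redirect-aware client: the statement lands on the document's own URI.
final_uri, doc = get_with_final_uri("/toucan")
careful = make_statements(final_uri, doc)
```

Here naive ends up typing /toucan itself as a :Document, while careful types /toucan.rdf, which is what the 303 was meant to achieve.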

Best,

Nathan
