Re: Is 303 really necessary?

Nathan Fri, 05 Nov 2010 03:59:31 -0700

Ian Davis wrote:

On Fri, Nov 5, 2010 at 10:05 AM, Nathan <[email protected]> wrote:

Not at all, I'm saying that if big-corp makes a /web crawler/ that describes
what documents are about and publishes RDF triples, then if you use 200 OK,
throughout the web you'll get (statements similar to) the following
asserted:


 </toucan> :primaryTopic dbpedia:Toucan ; a :Document .


I don't think so. If the bigcorp is producing triples from their crawl
then why wouldn't they use the triples they are sent (and/or
content-location, link headers etc)? The above looks like what you'd
get from a third party translation of the crawl results without the
context of actually having fetched the data from the URI.

Wouldn't be too sure about that, even the major browser vendors get it completely wrong. For instance, do an XHR for a URI in Chrome and even if there are 10 redirects in a chain, the base and the document URI are the one you requested. This is true all over the place, from using file_get_contents in PHP to most HTTP clients in any language; the pattern is simply:


  requested-uri = "http://...";
  doc = get(requested-uri);

The info at the end is almost always ( requested-uri, doc ); in fact, often there's not even any way to get the redirected-to URI back out from the HTTP client.
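For contrast, here is a minimal sketch in Python of a client that keeps the redirected-to URI alongside the requested one. It assumes the standard library's urllib, which follows redirects automatically and reports the final URI through geturl(); the function name fetch is made up for illustration.

```python
from urllib.request import urlopen


def fetch(requested_uri):
    """Fetch a URI, following redirects, and keep both ends of the chain.

    Returns (requested_uri, final_uri, doc). Most client code only keeps
    (requested_uri, doc) and silently drops final_uri.
    """
    with urlopen(requested_uri) as response:
        # geturl() is the URI of the resource actually retrieved, i.e. the
        # last hop after any 3xx redirects urllib followed on our behalf.
        final_uri = response.geturl()
        doc = response.read()
    return requested_uri, final_uri, doc
```

If final_uri differs from requested_uri, the doc came from somewhere other than the URI that was asked for, and a crawler that wants to say what the document is about should attach its statements to final_uri rather than requested_uri.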

As for using the triples they are sent, all you need to do is consider an HTML crawler running over RDFa documents.

If the bigcorp is not linked data aware then today they will follow
the 303 redirect as a standard HTTP redirect. rfc2616 says that the
target URI is not a substitute for the original URI but just an
alternate location to get a response from. The bigcorp will simply
infer the statements you list above **even though there is a 303
redirect**.

Exactly, kind of semi-damning all /slash URIs... or at least requiring a load of provenance data.

As rfc2616 itself points out, many user agents treat 302 and 303
interchangeably. Only linked data aware agents will ascribe special
meaning to 303 and they're the ones that are more likely to use the
data they are sent.

God knows why linked data clients are ascribing any meaning to 303; the pattern's there to ensure that a thing and the doc describing it have different URIs, and to ensure that people don't say that thing is a document. Although it's not exactly worked out that way. The use of the particular status code 303 is only relevant if you're ascribing meaning to the response code of GETs; if you're not, then 3** will do the same job.

Out of interest, just who is trawling the web and going "301 that's an IR, 303 that's maybe not an IR, 302 that's an IR"?

My personal opinion on the entire thing is as simple as: give different things different names. If there's a good chance something will think that thing is a different kind of thing because of a particular URI scheme or style (like saying mailto:[email protected] is a mailbox), then avoid it if it conflicts with the kind of thing you're describing. IMO slash URIs are often taken to mean documents, so I avoid them. You don't, so regardless of what status code you use, or how you deploy data, that conflation will be there. Thus my takeaway on the whole thing for you (and even though it goes against the TAG) is: just 200 your URIs if you want to, but don't go around telling the rest of the world to do it and promoting it as a good pattern, as it's not. The tdb scheme or frag URIs address the issues, whilst introducing others, but at least the data's somewhat cleaner.

I'll roll with the "who cares" line of thinking. I certainly don't care how you or dbpedia or foaf or dc publish your data, so long as I can deref it, but for god's sake don't go telling everybody that using slash URIs and 200 is "The Right Thing TM".


Best,

Nathan
