Re: Change Proposal for HttpRange-14

Hugh Glaser Sat, 24 Mar 2012 04:45:18 -0700

Many thanks.
I'm pleased I already put my name on the list then :-)
And for me some useful fleshing out of how things would/will work.


No further comments inline.
Best
Hugh

On 24 Mar 2012, at 11:17, Jeni Tennison wrote:

> Hi Hugh,
> 
> On 24 Mar 2012, at 10:02, Hugh Glaser wrote:
>> Please can you clarify something for me?
>> (I am not very good at reading these formal documents - a bear of little 
>> brain, perhaps.)
> 
> I will try my best.
> 
>> Am I right in thinking that, under your Change Proposal, the following sort 
>> of thing becomes possible (I hope I am getting it right).
>> Taking a site such as myexperiment.org (but it could very easily be the 
>> eprints software, BBC, or even dbpedia.)
>> See http://www.myexperiment.org/workflows/16
>> A huge barrier to adoption of LD for them was that their users would be 
>> exposed to the intricacies of the different URIs, and in particular that if 
>> myexperiment.org moved over to using LD URIs completely, users would not be 
>> able to cut and paste them from the address bar etc..
>> Great confusion would ensue, especially as their workflows already offered 
>> XML in addition to the HTML.
> 
> Right.
> 
>> This was a Bad Thing for them - their users were only just coming to terms 
>> with all this online workflow stuff, and could easily get spooked.
>> They nearly didn't do it, but because many of their technology providers 
>> were Linked Data people, it went ahead (a few years ago now).
>> The current outcome is what you see at the bottom of the workflow page - a 
>> panel offering the different URIs, with a link to a page describing the 
>> Linked Data world (to Chemists), which they are expected to understand.
>> (Hash URIs might have been a bit better, but introduced a different 
>> mechanism from the XML.)
> 
> Yep.
> 
>> As a result of your Change Proposal, it would have been acceptable (*if they 
>> wanted*), to simply add RDF as a Content Negotiation option, and deliver an 
>> RDF document with 200, in response to -H Accept:application/rdf+xml 
>> http://www.myexperiment.org/workflows/16, just as they did for XML, I think.
>> And this would enable them to use http://www.myexperiment.org/workflows/16 
>> as the anchor throughout the site (as they do) and have the same URI in the 
>> address bar, and in fact have http://www.myexperiment.org/workflows/16 as 
>> the only thing users see.
>> Is that right?
> 
> Yes. They could have used http://www.myexperiment.org/workflows/16 throughout 
> the site, had it respond with a 200 based on conneg with either HTML or RDF 
> as required. It wouldn't have taken a linked data expert to figure out that 
> if they wanted to refer to the workflow they had to copy and paste from the 
> box at the bottom of the HTML page rather than the location bar at the top 
> from which you usually copy and paste URIs.
> 
> They could also (as they are doing) have had separate URIs for the individual 
> formats like:
> 
>  http://www.myexperiment.org/workflows/16.html
>  http://www.myexperiment.org/workflows/16.rdf
>  http://www.myexperiment.org/workflows/16.xml
> 
> They could have included within the RDF that you got from 
> http://www.myexperiment.org/workflows/16 statements of the form:
> 
>  <http://www.myexperiment.org/workflows/16> 
>    wdrs:describedby <http://www.myexperiment.org/workflows/16.html> ;
>    wdrs:describedby <http://www.myexperiment.org/workflows/16.rdf> ;
>    wdrs:describedby <http://www.myexperiment.org/workflows/16.xml> ;
>    .
> 
> This would have enabled them to make separate statements about the licensing 
> and provenance of the information held in those documents. If they didn't 
> want to make those kinds of statements or enable those formats to be 
> individually addressable, they could have just supported the 
> http://www.myexperiment.org/workflows/16 URL and used conneg.
> 
>> Apropos Doing It Wrong:
>> It is interesting to note that I see myexperiment.org have made the 
>> practical decision to 303 to the RDF from 
>> curl -i -L -H Accept:application/rdf+xml 
>> http://www.myexperiment.org/workflows/16.html
>> which suggests that they are already subverting things to get round some 
>> sort of problem.
> 
> It looks as though it's:
> 
>  http://www.myexperiment.org/workflows/16.html
>  -> 301 -> http://www.myexperiment.org/workflows/16
>  -> 303 -> http://www.myexperiment.org/workflows/16.rdf
>  -> 200
> 
> Technically I think, per http://www.w3.org/2001/tag/doc/uddp/#idp439264 this 
> should mean that you can infer http://www.myexperiment.org/workflows/16.html 
> sameAs http://www.myexperiment.org/workflows/16 but I'm not 100% sure what's 
> intended (I think this needs spelling out).
> 
>> Few sites I can find (apart from dbpedia) actually return 406 when you ask 
>> the HTML URI for RDF: they usually return the HTML.
>> It is a foolish agent that relies on RDF coming back from a 200 OK when it 
>> has asked for application/rdf+xml.
> 
> Yes.
> 
>> Apropos Risk.
>> You say there is no risk.
>> Is this a risk?:
>> There may be a serious increase in the number of URIs for current sites.
>> 
>> Taking Freebase as another example.
>> (In fact any of these sites that have worked hard to conform to the current 
>> regime will have a decision to make.)
>> Presently, if I
>> curl -i -L -H Accept:application/rdf+xml 
>> http://www.freebase.com/view/en/engelbert_humperdinck
>> it gives me back HTML.
>> What will it do in future?
>> I know this Change Proposal is not proposing that they need to change, but 
>> will they?
>> They already have http://rdf.freebase.com/ns/en.engelbert_humperdinck (and 
>> http://rdf.freebase.com/ns/m.047vj6 and another longer one).
>> Effectively http://www.freebase.com/view/en/engelbert_humperdinck becomes 
>> yet another URI that people can use, since it would return RDF (as 
>> myexperiment).
>> Obviously I am viewing this a bit from the sameAs.org viewpoint.
>> I know that the resource in the RDF document will (should) never be the HTML 
>> URI, but people can and possibly will start passing around the HTML URI as 
>> if it was the "proper" URI, and so a sensible sameAs service would have it 
>> as a way of looking up the "proper" URIs.
>> In fact I have sometimes toyed with the idea of allowing look up by HTML URL 
>> on sameAs.org (giving back only the "real" Linked Data URIs) - it is what a 
>> user expects from such a query, after all.
>> (I hope all that makes sense.)
> 
> I guess I don't quite see the distinction that you're making between "HTML 
> URIs" and "proper" URIs. Perhaps that's because I've become too embedded in 
> the world where RDF data is embedded within HTML pages. I think that where 
> Jonathan's document says [1]:
> 
>  A "URI documentation carrier" for a URI is a representation that carries 
>  URI documentation that bears on the meaning of that URI. Applying the 
>  adjective "nominal" is a technicality that signifies that being a URI 
>  documentation carrier for the URI is expected according to this 
>  specification, but that it might not actually be one (for example, the 
>  representation might be empty, or it might contain information, but not 
>  information that helps to document the URI, perhaps as the result of a 
>  mistake).
> 
> what he's trying to tease out is the fact that you might not get any data 
> back about a particular URI when you request that URI, but what you do get 
> back is still its (empty) documentation. The URI doesn't become meaningless 
> just because you get nothing back; it doesn't mean others can't make 
> statements about it.
> 
> So in my view we're already living in a world where those "HTML URLs" exist 
> and are meaningful and a sameAs service could be making statements about them.
> 
> Sorry, I'm probably missing something.
> 
> Where well-behaved sites will have to make a decision is whether to continue 
> to use a 303 or switch to using a 200 and including a 'describedby' 
> relationship. For example, we at legislation.gov.uk might be seriously 
> tempted to switch to returning 200s from /id/ URIs. Currently, anyone 
> requesting an /id/ places a load on our origin server because the CDN can't 
> cache the 303 response, so we try to avoid using them in links on our site 
> even where we could (and really should). Consequently people referring to 
> legislation don't use the /id/ URIs when what they are referring to is the 
> legislation item, not a particular version of it. If we switched to a 200, we 
> wouldn't have to avoid those URIs, which would in turn help us embed RDFa in 
> our pages, because instead of having a reference in a footnote contain 
> something like:
> 
>  <a rel="leg:references" 
>     resource="/id/ukpga/1985/67/section/6"
>     href="/ukpga/1985/67/section/6">1985 c. 67 s. 6</a>
> 
> we could just use:
> 
>  <a rel="leg:references" 
>     href="/id/ukpga/1985/67/section/6">1985 c. 67 s. 6</a>
> 
> but none of this increases the number of URIs that we're using, it just makes 
> us switch to referring to legislation items using the URIs that we'd designed 
> to be used to refer to legislation items.
> 
> Cheers,
> 
> Jeni
> 
> [1] http://www.w3.org/2001/tag/doc/uddp/#carriage
> -- 
> Jeni Tennison
> http://www.jenitennison.com
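
For anyone following along, here is a rough sketch of the "one URI, 200 plus
conneg" behaviour discussed above. It is only an illustration: the function
names and the list of available formats are made up for the example, and the
Accept parsing is deliberately simplified (only q-values are read, and a
wildcard can still match a type that was excluded with q=0).

```python
# Hypothetical sketch: pick a representation for one URI from the Accept header.
# Simplified on purpose: only q-values are parsed, other parameters are ignored.

AVAILABLE = ["text/html", "application/rdf+xml", "application/xml"]  # assumed formats


def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def matches(pattern, media_type):
    """True if an Accept pattern such as text/* or */* covers media_type."""
    if pattern == "*/*":
        return True
    if pattern.endswith("/*"):
        return media_type.startswith(pattern[:-1])
    return pattern == media_type


def negotiate(accept_header, available=AVAILABLE):
    """Return (status, content_type); 406 when nothing acceptable is on offer."""
    if not accept_header or not accept_header.strip():
        return 200, available[0]  # no preference stated: serve the default
    for pattern, q in parse_accept(accept_header):
        if q <= 0:
            continue
        for media_type in available:
            if matches(pattern, media_type):
                return 200, media_type
    return 406, None
```

With this, negotiate("application/rdf+xml") picks the RDF representation and
a browser-style Accept header picks the HTML, both with a 200 from the same
URI. A request for a format that is not on offer gets a 406 rather than HTML.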
> 
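
The redirect chain quoted above, and the point that an agent should not
assume RDF just because it got a 200, can be sketched without touching the
network. The response table below is canned from the chain reported in this
thread; the Content-Type on the final hop and all the names are assumptions
for illustration only.

```python
# Hypothetical sketch: walk a redirect chain using a canned response table
# instead of the network. The table mirrors the chain quoted in the thread.

BASE = "http://www.myexperiment.org/workflows/16"

# url -> (status, Location or None, Content-Type or None); values are assumptions
RESPONSES = {
    BASE + ".html": (301, BASE, None),
    BASE: (303, BASE + ".rdf", None),
    BASE + ".rdf": (200, None, "application/rdf+xml"),
}

REDIRECTS = (301, 302, 303, 307, 308)


def follow(url, responses=RESPONSES, max_hops=5):
    """Return the list of (url, status) hops and the final Content-Type."""
    hops = []
    for _ in range(max_hops):
        status, location, content_type = responses[url]
        hops.append((url, status))
        if status in REDIRECTS and location:
            url = location
            continue
        return hops, content_type
    raise RuntimeError("too many redirects")


def got_rdf(hops, content_type):
    """Trust the Content-Type of the final response, not the 200 alone."""
    final_status = hops[-1][1]
    media_type = (content_type or "").split(";")[0].strip().lower()
    return final_status == 200 and media_type == "application/rdf+xml"
```

Calling follow(BASE + ".html") records the 301, 303 and 200 hops in order,
and got_rdf() only says yes when the last response is both a 200 and labelled
as application/rdf+xml, so a site that answers with HTML is not mistaken for
one that returned RDF.
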

-- 
Hugh Glaser,  
             Web and Internet Science
             Electronics and Computer Science,
             University of Southampton,
             Southampton SO17 1BJ
Work: +44 23 8059 3670, Fax: +44 23 8059 3045
Mobile: +44 75 9533 4155 , Home: +44 23 8061 5652
http://www.ecs.soton.ac.uk/~hg/
