Re: httpRange-14 Change Proposal

Noah Mendelsohn Sun, 25 Mar 2012 10:34:33 -0700

(speaking as chair here -- I'll send some technical comments separately)


Jeni: thank you so much for posting this.

Jonathan: might I ask that you update the agenda and required reading for the F2F to include this and any other proposals or e-mails that you feel the TAG should read in preparation for the F2F?


Thank you both.

Noah

On 3/25/2012 5:47 AM, Jeni Tennison wrote:

Hi,

Please find below a Change Proposal for the consideration of the TAG in 
response to [1] on behalf of (alphabetically):

Ian Davis
Leigh Dodds
Nick Gibbins (University of Southampton)
Hugh Glaser
Steve Harris
Masahide Kanzaki
Gregg Kellogg
Niklas Lindström
Jerry Persons
Dave Reynolds
Bill Roberts
Andy Seaborne
John Sheridan
Ben O'Steen
Damian Steer
Thomas Steiner
Ed Summers
Jeni Tennison
Davy Van Deursen

and with thanks to other members of the LOD mailing list who helped identify 
areas that required clarification. The original version is at [2].

[1] http://www.w3.org/2001/tag/doc/uddp/change-proposal-call.html
[2] https://docs.google.com/document/d/1ognNNOIcghga9ltQdoi-CvbNS8q-dOzJjhMutJ7_vZo/edit

---

# Summary

This proposal contains two substantive changes.

First, a 200 response to a probe URI no longer by itself implies that the probe 
URI identifies an information resource or that the response is a representation 
of the resource identified by the probe URI; instead, this can only be inferred 
if the probe URI is the object of a ‘describedby’ relationship or the target of 
a 303 redirection.

Second, it enables publishers to link to URI documentation for a given probe 
URI by providing a 200 response to that probe URI that contains a statement 
including a ‘describedby’ relationship from the probe URI to the URI 
documentation.


# Rationale

While there are instances of linked data websites using 303 redirections, there 
are also many examples of people making statements about hash-less URIs 
(particularly using HTML link relations, RDFa, microdata, and microformats) 
where those statements indicate that the URI is supposed to identify a 
non-information resource such as a Person or Book. The Appendix provides 
examples of these.

Rather than simply telling these people that they are Doing It Wrong, 
“Understanding URI Hosting Practice as Support for URI Documentation Discovery” 
should ensure that:

  * applications that interpret such data do not draw wrong conclusions about 
these URIs simply because they return a 200 response

  * publishers of this data can easily upgrade to making the distinction 
between the non-information resource that the page holds information about and 
the information resource that is the page itself, should they discover that 
they need to


# Details

In section 4.1, in place of the second paragraph and following list, substitute:

   There are three ways to locate a URI documentation link in an HTTP response:

    * using the Location: response header of a 303 See Other response
      [httpbis-2], e.g.

      303 See Other
      Location: http://example.com/uri-documentation

    * using a Link: response header with link relation 'describedby' ([rfc5988],
      [powder]), e.g.

      200 OK
      Link: <http://example.com/uri-documentation>; rel="describedby"

    * using a ‘describedby’ ([powder]) relationship within the RDF graph
      created by interpreting the content of a 200 response, e.g.

      200 OK
      Content-Type: text/turtle

      PREFIX : <http://www.iana.org/assignments/relation/>
      <http://example.com>
        :describedby <http://example.com/uri-documentation> ;
        .
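
As a rough illustration of the three mechanisms listed above, a consumer might collect candidate URI documentation links along these lines. This is only a sketch and not part of the proposed text: the function name `documentation_links`, the lower-cased `headers` dictionary, and the pre-parsed `triples` argument are all assumptions made for the example, the Link header parsing is deliberately simplistic, and checking the Link header and body only on 2XX responses is a choice of this sketch.

```python
import re

# The two 'describedby' IRIs named in this proposal.
DESCRIBEDBY_IRIS = {
    "http://www.iana.org/assignments/relation/describedby",
    "http://www.w3.org/2007/05/powder-s#describedby",
}

# Simplified Link header parsing: <target> followed by its parameters.
# A real parser would need to handle quoted commas and other edge cases.
_LINK_VALUE = re.compile(r'<([^>]*)>([^,]*)')
_REL_PARAM = re.compile(r';\s*rel\s*=\s*"?([^";,]*)"?', re.IGNORECASE)


def documentation_links(probe_uri, status, headers, triples=()):
    """Return URI documentation links found via the three mechanisms.

    headers: dict mapping lower-cased header names to values (assumption).
    triples: (subject, predicate, object) IRI strings that some RDF parser
             has already extracted from the response body (assumption).
    """
    links = []
    # 1. Location header of a 303 See Other response.
    if status == 303 and "location" in headers:
        links.append(headers["location"])
    if 200 <= status < 300:
        # 2. Link header with link relation 'describedby'.
        for target, params in _LINK_VALUE.findall(headers.get("link", "")):
            rel = _REL_PARAM.search(params)
            if rel and "describedby" in rel.group(1).lower().split():
                links.append(target)
        # 3. 'describedby' statement about the probe URI in the content.
        for subject, predicate, obj in triples:
            if subject == probe_uri and predicate in DESCRIBEDBY_IRIS:
                links.append(obj)
    return links
```

Returning a list rather than a single link reflects the point made under Positive Effects that a URI may have more than one description document.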

Before the last paragraph of section 4.2 insert the following two paragraphs:

   In the third case, where the ‘describedby’ relationship is used,
   <http://www.iana.org/assignments/relation/describedby> and
   <http://www.w3.org/2007/05/powder-s#describedby> must be treated as
   equivalent, as defined in Section 4.1.4 Semantic Linkage Using the
   describedby Property of the POWDER Recommendation.

In the last paragraph of section 4.1, for “(But see below for the case when 
retrieval is successful.)” substitute “The next section describes how to 
interpret a 200 response, and therefore applies in the last two cases described 
above.”

In section 4.2, in place of the first paragraph (after the Editorial Note), 
substitute:

   If there is a nominal representation Z from the probe URI (a 2XX response),
   and the application is aware of a ‘describedby’ relationship of which the
   probe URI is the object, which may be the case because

    * the probe URI is itself a URI linked to through one of the mechanisms
      listed in Section 4.1
    or
    * Z itself contains a statement in which the probe URI is the object of a
      ‘describedby’ relationship

   then this is equivalent to there being a nominal URI documentation carrier
   for the probe URI that says that Z is a current representation of the
   resource identified by the probe URI, and, moreover, that the identified
   resource is an "information resource" (see below). In other cases, no such
   inference can be made (the application cannot tell whether the probe URI
   identifies an information resource or not).
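
One possible reading of the substituted paragraph, expressed as code, is sketched below. It is illustrative only: the function name `may_infer_information_resource`, the `linked_via_section_4_1` flag, and the pre-parsed `triples` argument are assumptions of the example, and the function returns `None` (not `False`) when nothing can be inferred, to mirror "no such inference can be made".

```python
# The two 'describedby' IRIs that the proposal says must be treated as equivalent.
DESCRIBEDBY_IRIS = {
    "http://www.iana.org/assignments/relation/describedby",
    "http://www.w3.org/2007/05/powder-s#describedby",
}


def may_infer_information_resource(probe_uri, status, triples=(),
                                   linked_via_section_4_1=False):
    """Return True if the probe URI can be taken to identify an
    information resource, or None if no inference can be made.

    triples: (subject, predicate, object) IRI strings already parsed from
             the representation Z (assumption of this sketch).
    linked_via_section_4_1: True if the application reached the probe URI
             through a 303 Location, a Link header, or a 'describedby'
             statement found elsewhere (assumption of this sketch).
    """
    # The rule only applies when there is a 2XX response.
    if not 200 <= status < 300:
        return None
    # Case 1: the probe URI was linked to via a Section 4.1 mechanism.
    if linked_via_section_4_1:
        return True
    # Case 2: Z itself says that the probe URI is the object of 'describedby'.
    for _subject, predicate, obj in triples:
        if predicate in DESCRIBEDBY_IRIS and obj == probe_uri:
            return True
    # Otherwise a 200 response alone tells us nothing either way.
    return None
```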

We also recommend that a clear guide on best practices when publishing and 
consuming data should be written, possibly an update to [cooluris].


# Impact

## Positive Effects

  * common usage of URIs in sites supporting RDFa, microdata and microformats 
is no longer deemed to be Doing It Wrong, which means this data can be 
interpreted in the way that it was intended by those publishers by conformant 
applications

  * publishers that cannot change server configuration (to use 303s or Link 
headers) can still use separate URIs to identify a non-information resource and 
the information resource that describes it

  * publishers who (through ignorance or preference) originally publish data 
about non-information resources without using 303s or Link headers can retain 
those URIs and add the ‘describedby’ statement to add separate identifiers

  * it is possible to have multiple description documents for a given URI, 
where a 303 response only allows one

  * it means the same method can be used to provide descriptions of 
non-information resources as is used for providing descriptions of information 
resources, which aids adoption

  * it means there is a standard method for providing links from documentation 
to the thing that it documents

  * it provides a standard means of explicitly adding in data information that 
would otherwise only be available if a resource is accessed by HTTP, which 
means that reasoning over dumps of crawls of the web (eg from 
webdatacommons.org) becomes more consistent with what could be inferred from 
the crawl itself

  * browser location bars don’t change when navigating to URIs that provide a 
200 response, which means fewer copy/paste errors and less user confusion when 
trying to encourage people to use URIs for non-information resources and not 
those for their documentation

## Negative Effects

  * existing applications that assume that a 200 response is only given for an 
information resource may make false inferences about what a probe URI 
identifies (but this happens already, as people already publish data in this 
way)

  * there are more cases where applications will have to draw on reasoning from 
other properties (eg declared types of resources) to work out what a URI 
identifies

  * when documentation is served with a 200 response from a probe URI and does 
not contain a 'describedby' statement, some agents (including the publisher) 
might use it to identify the documentation and others a non-information 
resource. Publishers still need to provide support for two distinct URIs if 
they want to enable more consistent use of the probe URI; a set of best 
practices for linked data publishers would need to spell out what publishers 
should do and how consumers should interpret the information provided within 
the response and that found at the end of any ‘describedby’ links


# Conformance Classes Changes

There is no mention of conformance classes in the document.


# Risks

There are no risks.


# References

[cooluris]
Leo Sauermann and Richard Cyganiak. Cool URIs for the Semantic Web. W3C 
Interest Group Note, 03 December 2008. (See 
http://www.w3.org/TR/2008/NOTE-cooluris-20081203/.)

# Appendix: Examples of 200 Responses for NIRs

## http://www.logosportswear.com/product/1531

responds with a 301 redirection to
http://www.logosportswear.com/product/1531/harbor-cruise-boat-tote which 
contains the RDFa statement

  <http://www.logosportswear.com/product/1531>
    a <http://rdf.data-vocabulary.org/#Product> ;
    .

The URI is intended to identify a product, not a web page.


## http://developer.yahoo.com/yui/docs/YAHOO.util.Dom.html

contains RDFa statements that state that this web page contains events, methods 
and properties:

  <http://developer.yahoo.com/yui/docs/YAHOO.util.Dom.html>
    yui:attributes <#configattributes>;
    yui:description """
                        Provides helper methods for DOM elements.
                    """;
    yui:events <#events>;
    yui:methods <#methods>;
    yui:name "YAHOO.util.Dom";
    yui:properties <#properties> .

From the statements, the intention is for the URI to identify the (programming 
language) Object, not a web page (despite the .html on the end!).


## http://gondwanaland.com/mlog/2005/03/13/semweb-not-by-committee/

contains the RDFa statements

  <http://gondwanaland.com/mlog/2005/03/13/semweb-not-by-committee/>
     dcterms:publisher <http://gondwanaland.com/mlog/> ;
     sioc:has_owner <https://creativecommons.net/ml/> ;
     .

The range of dcterms:publisher is a dcterms:Agent, but 
http://gondwanaland.com/mlog/ returns a 200.

The range of sioc:has_owner is a sioc:UserAccount, but 
https://creativecommons.net/ml/ returns a 200.


## http://www.feedbooks.com/book/2679

contains microdata statements. How you should interpret these as RDF is 
obviously debatable but the obvious thing to do is for a href attribute to 
indicate the resource that it targets, so the page includes the statements

   [ a schema:Book ;
     schema:author <http://www.feedbooks.com/author/496> ; ]

The range for schema:author is intended (I think) to be a person rather than a 
web page about a person, but resolving http://www.feedbooks.com/author/496 
gives you a 200.

(Based on the webdatacommons.org dumps, this site used to serve up RDFa that stated 
that <http://www.feedbooks.com/book/2679> identified a Book; books are not web 
pages.)


## http://www.mybanktracker.com/Citibank/Profile

contains the RDFa statements

    <http://www.mybanktracker.com/Citibank/Profile>
      v:dtreviewed "2012-01-05 16:42:49"@en-US;
      v:itemreviewed <http://www.mybanktracker.com/Citibank/Profile>;
      v:rating "4"@en-US;
      v:reviewer <http://www.mybanktracker.com/member/lisaehrlich>;
      .

The review is clearly about Citibank and not the web page.

The object of the v:reviewer property should, I imagine, be a person but is 
instead a web page.


## http://www.flickr.com/photos/andreaweckerle/2559011937/

used to contain the triples (according to the webdatacommons.org data)

   <http://www.flickr.com/photos/andreaweckerle/2559011937/>
     dcterms:creator <http://www.flickr.com/photos/andreaweckerle/>
     .

   <http://www.flickr.com/photos/andreaweckerle/>
     foaf:name "andreaweckerle"
     .

where http://www.flickr.com/photos/andreaweckerle/ resolves to a 200 but is 
plainly intended here to be a Person. Those statements don't seem to be there 
any more.


## http://www.businesswire.com

redirects with a 302 to http://www.businesswire.com/portal/site/home/ which 
gives a 200 response and contains microdata that maps to the triples

<http://www.businesswire.com> a schema:Organization;
    schema:name "BUSINESS WIRE";
    schema:url <http://www.businesswire.com> .

which says that http://www.businesswire.com is an organisation, not a web page.
