Re: URI Comparisons: RFC 2616 vs. RDF

Christopher Gutteridge Mon, 17 Jan 2011 09:39:22 -0800

In the short term, it sounds like there's a gap in the code ecosystem for a really lightweight tool which takes a stream of N-Triples and just outputs a normalised stream of N-Triples ready for import. The examples below would make a good initial test set for it. I'd write it if I didn't have a bunch of code-bunnies biting my ankles and demanding to be created.

As for triple stores, I know that the number of triples-per-second on import can be important, so if you already know your data is clean you'd want to at least make normalise-on-input optional to improve performance.
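
A minimal sketch of such a filter, in Python, might look like the following. It is only an illustration under stated assumptions: the names normalise_iri, normalise_line and normalise_stream are hypothetical, the default normaliser does nothing more than lower-case the scheme and host, and the regular expression is a simplification of the N-Triples grammar (it skips quoted literals and whole-line comments, but not trailing comments).

```python
import re

# Matches either a quoted literal (left untouched) or an IRI in angle brackets.
_TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|<([^<>"\s]*)>')

# Scheme, then optionally "//" plus userinfo, then host[:port].
_SCHEME_AUTH = re.compile(
    r"^([A-Za-z][A-Za-z0-9+.-]*:)(?:(//(?:[^/?#@]*@)?)([^/?#]*))?"
)


def normalise_iri(iri):
    """Lower-case the scheme and host only (a deliberately small stand-in
    for a fuller normaliser); userinfo, path, query and fragment are kept."""
    def repl(match):
        scheme, prefix, hostport = match.group(1), match.group(2), match.group(3)
        if prefix is None:  # no authority component, e.g. "urn:..."
            return scheme.lower()
        return scheme.lower() + prefix + hostport.lower()
    return _SCHEME_AUTH.sub(repl, iri, count=1)


def normalise_line(line, normalise=normalise_iri):
    """Rewrite every IRI on one N-Triples line, leaving literals alone."""
    if line.lstrip().startswith("#"):  # whole-line comment
        return line

    def repl(match):
        if match.group(1) is None:  # the token was a quoted literal
            return match.group(0)
        return "<" + normalise(match.group(1)) + ">"
    return _TOKEN.sub(repl, line)


def normalise_stream(lines, normalise=normalise_iri):
    """Lazily normalise an iterable of N-Triples lines."""
    for line in lines:
        yield normalise_line(line, normalise)
```

Something like sys.stdout.writelines(normalise_stream(sys.stdin)) would turn this into a pipe filter. Because the normaliser is passed in as a parameter, an importer that already trusts its data could simply skip the call, which keeps normalise-on-input optional.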


On 17/01/11 16:57, Nathan wrote:

Kingsley Idehen wrote:
On 1/17/11 10:51 AM, Martin Hepp wrote:
Dear all:

RFC 2616 [1, section 3.2.3] says that
"When comparing two URIs to decide if they match or not, a clientSHOULD use a case-sensitive octet-by-octet comparison of the entire
   URIs, with these exceptions:

      - A port that is empty or not given is equivalent to the default
        port for that URI-reference;
      - Comparisons of host names MUST be case-insensitive;
      - Comparisons of scheme names MUST be case-insensitive;
      - An empty abs_path is equivalent to an abs_path of "/".

   Characters other than those in the "reserved" and "unsafe" sets (see
   RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding.

   For example, the following three URIs are equivalent:

      http://abc.com:80/~smith/home.html
      http://ABC.com/%7Esmith/home.html
      http://ABC.com:/%7esmith/home.html
"

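The comparison rules quoted above can be sketched in Python as follows. This is an illustration rather than a reference implementation: comparison_key and uris_match are hypothetical names, the default-port table is an assumption covering only http and https, and the RFC 3986 "unreserved" set is used as a stand-in for "characters other than reserved and unsafe".

```python
import re
import string
from urllib.parse import urlsplit

# Assumption: only these two schemes have a known default port here.
DEFAULT_PORTS = {"http": 80, "https": 443}

# Assumption: RFC 3986 unreserved characters approximate the RFC 2616 wording.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def _decode_unreserved(text):
    """Replace %HH with its character when that character is unreserved;
    every other escape is left exactly as written (case-sensitive)."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else match.group(0)
    return _PCT.sub(repl, text)


def comparison_key(uri):
    """Build a tuple that is equal for URIs the quoted rules call equivalent."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()                    # scheme: case-insensitive
    host = (parts.hostname or "").lower()            # host: case-insensitive
    port = parts.port or DEFAULT_PORTS.get(scheme)   # empty port means default
    userinfo = parts.netloc.rpartition("@")[0]
    path = _decode_unreserved(parts.path) or "/"     # empty abs_path means "/"
    return (scheme, userinfo, host, port, path,
            _decode_unreserved(parts.query),
            _decode_unreserved(parts.fragment))


def uris_match(a, b):
    return comparison_key(a) == comparison_key(b)
```

Under these assumptions, uris_match treats the three example URIs above as equivalent, while paths that differ only in letter case still compare as different.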
Does this also hold for identifying RDF resources?
Yes, where an RDF resource is a Data Container at an Address (URL). Thus, equivalent results for de-referencing a URL en route to accessing data.
No, when "resource" also implies an Entity (Data Item or Data Object) that is assigned a Name via URI.
Logically, yes on both counts, we should/could be normalizing these URIs as we consume and publish using the syntax-based normalization rules [1] which apply to all URI/IRIs with the generic syntax (such as the examples above).
Any client consuming data, or server publishing data, can use the normalization rules, so it stands to reason that it's pretty important that we all do it to avoid false negatives.
[1] http://tools.ietf.org/html/rfc3986#section-6.2.2
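
As a rough sketch of what the syntax-based rules in [1] involve (case normalization, percent-encoding normalization and dot-segment removal), something like the following Python could be used. The function names are hypothetical, absolute URIs are assumed, and scheme-based steps such as dropping a default port (RFC 3986 section 6.2.3) are deliberately left out.

```python
import re
import string

UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
_PCT = re.compile(r"%([0-9A-Fa-f]{2})")

# Component split adapted from the regular expression in RFC 3986 Appendix B.
_URI = re.compile(
    r"^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?$",
    re.DOTALL,
)


def _fix_pct(text):
    """Decode escapes of unreserved characters; upper-case the hex of the rest."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else "%" + match.group(1).upper()
    return _PCT.sub(repl, text)


def remove_dot_segments(path):
    """Remove "." and ".." segments, following RFC 3986 section 5.2.4."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./") or path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        elif path.startswith("/../") or path == "/..":
            path = "/" + path[4:]
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with any leading "/") to the output.
            cut = path.find("/", 1)
            if cut == -1:
                cut = len(path)
            output.append(path[:cut])
            path = path[cut:]
    return "".join(output)


def normalise(uri):
    scheme, authority, path, query, fragment = _URI.match(uri).groups()
    out = ""
    if scheme is not None:
        out += scheme.lower() + ":"
    if authority is not None:
        userinfo, at, hostport = authority.rpartition("@")
        # Lower-case the host, then re-run _fix_pct to restore upper-case hex.
        out += "//" + _fix_pct(userinfo) + at + _fix_pct(_fix_pct(hostport).lower())
    out += remove_dot_segments(_fix_pct(path))
    if query is not None:
        out += "?" + _fix_pct(query)
    if fragment is not None:
        out += "#" + _fix_pct(fragment)
    return out
```

A consumer or publisher could apply normalise to each URI before storing or comparing it, so that two spellings of the same identifier end up as the same string.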

Best,

Nathan


--
Christopher Gutteridge -- http://id.ecs.soton.ac.uk/person/1248

/ Lead Developer, EPrints Project, http://eprints.org/
/ Web Projects Manager, ECS, University of Southampton, http://www.ecs.soton.ac.uk/
/ Webmaster, Web Science Trust, http://www.webscience.org/
