I could be accused of having gone overboard ... each of the slightly
different specs as an explicit representation in the code ...
Having changed jobs and looking at this from a different perspective, I am
less convinced by the pickiness.
There is a largely unrealized goal in the IRI code to link errors right
back to the specs so that the error messages quote chapter and verse.
Jeremy
On 11/9/2010 8:52 AM, Andy Seaborne wrote:
On 09/11/10 16:22, Florent Guillaume wrote:
On Tue, Nov 9, 2010 at 1:19 PM, Andy Seaborne
<andy.seabo...@epimorphics.com> wrote:
Jeremy identified the IRI library as a potential contribution to a commons
area. It is free-standing, and does not use or call any Jena RDF code - it
depends only on ICU4J (and JUnit + JFlex in the build).
Please note that Abdera already has an IRI library.
http://svn.apache.org/repos/asf/abdera/java/trunk/dependencies/i18n/src/main/java/org/apache/abdera/i18n/iri/
Florent,
Thanks for pointing that out. I see it has a test suite as well and
it would be good to make sure we've got things right.
Illegal IRIs have been a bit of a plague in RDF data, and the IRI library
(written by Jeremy) is a response to that. It checks both the rules for
specific IRI schemes and the recommended forms, since IRIs are often
compared for equality. The library is quite picky. It includes profiles
for RDF URI references, IRI, and the compromise we use in Jena as a
balance of legacy and spec exactness.
There is an online test service for RDF data in non-RDF/XML formats at:
http://sparql.org/data-validator.html
The IRIs are checked with the IRI library.
Andy
A few examples:
http://example/a b
Code: 17/WHITESPACE in PATH: A single whitespace character. These
match no grammar rules of URIs/IRIs. These characters are permitted in
RDF URI References, XML system identifiers, and XML Schema anyURIs.
http://example/a[]b
Code: 0/ILLEGAL_CHARACTER in PATH: The character violates the grammar
rules for URIs/IRIs.
http://example:80/
Code: 13/DEFAULT_PORT_SHOULD_BE_OMITTED in PORT: If the port is the
default one for the scheme it should be omitted.
http://example:80/
Code: 14/PORT_SHOULD_NOT_BE_WELL_KNOWN in PORT: Ports under 1024 should
be accessed using the appropriate scheme name
urn:xyz
Code: 61/SCHEME_PATTERN_MATCH_FAILED in PATH: The scheme specific
syntax rules are violated.
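
For readers who want to see the shape of these checks, here is a rough
sketch in Python. It is not the IRI library's code or API: the function
names (check_iri, format_violation), the default-port table, and the list
of illegal path characters are assumptions made for illustration. Only the
codes and names are taken from the messages above, and the scheme-specific
check (code 61) is left out.

```python
from urllib.parse import urlsplit

# Codes and names are copied from the messages quoted above. The checks
# themselves are simplified stand-ins, not the library's actual rules.
WHITESPACE = (17, "WHITESPACE")
ILLEGAL_CHARACTER = (0, "ILLEGAL_CHARACTER")
DEFAULT_PORT_SHOULD_BE_OMITTED = (13, "DEFAULT_PORT_SHOULD_BE_OMITTED")
PORT_SHOULD_NOT_BE_WELL_KNOWN = (14, "PORT_SHOULD_NOT_BE_WELL_KNOWN")

# Assumption: only two schemes are covered in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}
# Assumption: a partial list of characters that are illegal in a path.
ILLEGAL_IN_PATH = set('[]<>"{}|\\^`')


def check_iri(iri):
    """Return a list of (code, name, component) tuples for one IRI string."""
    parts = urlsplit(iri)
    found = []

    # Path checks: report each kind of problem at most once.
    if " " in parts.path:
        found.append(WHITESPACE + ("PATH",))
    if any(ch in ILLEGAL_IN_PATH for ch in parts.path):
        found.append(ILLEGAL_CHARACTER + ("PATH",))

    # Port checks: these are style warnings rather than grammar errors.
    port = parts.port
    if port is not None:
        if DEFAULT_PORTS.get(parts.scheme) == port:
            found.append(DEFAULT_PORT_SHOULD_BE_OMITTED + ("PORT",))
        if port < 1024:
            found.append(PORT_SHOULD_NOT_BE_WELL_KNOWN + ("PORT",))
    return found


def format_violation(violation):
    """Render a violation in the 'Code: N/NAME in COMPONENT' style."""
    code, name, component = violation
    return f"Code: {code}/{name} in {component}"


if __name__ == "__main__":
    for example in ("http://example/a b",
                    "http://example/a[]b",
                    "http://example:80/"):
        print(example)
        for violation in check_iri(example):
            print(format_violation(violation))
```

A profile, in this picture, would be a choice of which of these codes count
as errors, which as warnings, and which are ignored. For instance, the
whitespace case above is permitted in RDF URI References but matches no
grammar rule of URIs/IRIs.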
---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org