On 15/08/13 21:36, Norman Walsh wrote:
Hi,
Jena falls over parsing this triple:
<http://www.worldcat.org/oclc/5512944>
<http://schema.org/about>
<http://id.loc.gov/authorities/classification/Microfilm 06252
\u003CMicroRR\u003E> .
It appears to expand the Unicode escape and then treat the resulting character
as a markup character. Is that a bug or is this triple malformed?
Malformed.
< is never legal in an IRI.
\u003C is an N-Triples/Turtle escape sequence. It does not put \-u-0-0-3-C
into the IRI. It puts a single <.
It does not matter how you force it into an IRI - there is an < in it
and it's illegal.
(\u processing is done before parsing the IRI, to pinpoint the error
more exactly ... but it would fail the IRI validation step even if you did
get it through).
See also
http://www.sparql.org/iri-validator.html
It's got a space in it as well.
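
A rough sketch of the two steps described above, in Python rather than Java and
not Jena's actual code: expand the \u escapes first, then look for characters
that may not appear in an IRI. The names unescape_uchar, illegal_chars and
ILLEGAL_IN_IRI are made up for illustration, the character set is only the
printable subset that RFC 3987 excludes, and the line break in the triple as
posted is assumed to be a single space.

```python
import re

# Assumed subset of characters never allowed unencoded in an IRI:
# space < > " { } | \ ^ `
ILLEGAL_IN_IRI = set(' <>"{}|\\^`')

# Matches \uXXXX (4 hex digits) or \UXXXXXXXX (8 hex digits).
_UCHAR = re.compile(r'\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})')


def unescape_uchar(text):
    # Replace each escape with the single character it denotes.
    return _UCHAR.sub(lambda m: chr(int(m.group(1) or m.group(2), 16)), text)


def illegal_chars(raw_iri):
    # Escape processing happens first, so an escaped '<' is still a '<'.
    expanded = unescape_uchar(raw_iri)
    return [c for c in expanded if c in ILLEGAL_IN_IRI]


# The object IRI from the triple, with the line break assumed to be one space.
raw = (r'http://id.loc.gov/authorities/classification/'
       r'Microfilm 06252 \u003CMicroRR\u003E')

print(unescape_uchar(raw))
print(illegal_chars(raw))
```

Run on that IRI, the check reports the spaces as well as the < and >, which is
why escaping the angle brackets does not make the triple legal.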
Andy
Be seeing you,
norm