On 18/05/15 01:30, Joshua TAYLOR wrote:
> On Sun, May 17, 2015 at 8:06 AM, rasha fawzy <[email protected]> wrote:
>> Exception in thread "main" org.apache.jena.riot.RiotException: [line: 5, col: 1 ] Broken IRI (newline): !DOCTYPE rdf:RDF [
> Looks like your file probably isn't legal RDF/XML, but without seeing
> it, we can't really say for sure.
If the file starts:
<!DOCTYPE rdf:RDF [
<!ENTITY rdf 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<!ENTITY rdfs 'http://www.w3.org/TR/WD-rdf-schema#'>
]>
then, as far as the N-Quads parser is concerned, the initial "<" starts
an IRI. The tokenizer collects characters quite liberally (later
checking would have caught the space) until it reaches the newline
without having seen a closing ">", hence "Broken IRI (newline)".
Recently, this changed [*]: the tokenizer now does more of the checking
itself, less is done after tokenizing, and there are fewer ways to
bypass the checks. The space is now caught early, and the error message
is:
ERROR [line: 1, col: 11] Bad character in IRI (space): <!DOCTYPE[space]...>
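
To make the difference concrete, here is a rough sketch in plain Java of the two scanning behaviours described above. It is not the actual RIOT tokenizer code: the class and method names (IriTokenSketch, scanLenient, scanStrict) are made up for illustration, line/column tracking is left out, and only the space character is checked in the strict variant.

```java
// Illustrative only: a toy stand-in for an N-Quads IRI tokenizer.
// Names here are hypothetical and do not come from Jena.
class IriTokenSketch {

    // Older behaviour (as described above): collect characters liberally
    // after '<' and stop only at '>' or at a newline.
    static String scanLenient(String input) {
        for (int i = 1; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (ch == '>') {
                return "IRI: " + input.substring(1, i);
            }
            if (ch == '\n') {
                // No closing '>' seen before the end of the line.
                return "Broken IRI (newline): " + input.substring(1, i);
            }
        }
        return "Broken IRI (end of input): " + input.substring(1);
    }

    // Newer behaviour (as described above): check characters while
    // tokenizing, so the first space is reported straight away.
    static String scanStrict(String input) {
        for (int i = 1; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (ch == '>') {
                return "IRI: " + input.substring(1, i);
            }
            if (ch == ' ') {
                return "Bad character in IRI (space): <"
                        + input.substring(1, i) + "[space]...>";
            }
            if (ch == '\n') {
                return "Broken IRI (newline): " + input.substring(1, i);
            }
        }
        return "Broken IRI (end of input): " + input.substring(1);
    }

    public static void main(String[] args) {
        // The start of an RDF/XML file, fed to the scanner as if it were N-Quads.
        String doc = "<!DOCTYPE rdf:RDF [\n"
                + "<!ENTITY rdf 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>\n";
        System.out.println(scanLenient(doc));
        System.out.println(scanStrict(doc));
    }
}
```

Both variants reject the input; the strict one just fails sooner and points at the space rather than at the end of the line.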
Andy
[*] https://issues.apache.org/jira/browse/JENA-911