Hi Andy

Andy Seaborne wrote:
> On 21/03/12 09:38, Paolo Castagna wrote:
>> Hi,
>> I am sorry if this is a silly question, but I need some clarity (or
>> another coffee already).
>
> Have you had that coffee yet?

:-)

>>
>> The following are Java strings, therefore \n is the new line character...
>>
>> Java strings                   Turtle literals   N-Triples literals
>> -----------------------------------------------------------------
>> \"\"\"Hello \n World\"\"\"     legal             illegal
>
> Yes - a triple quoted string can contain a raw newline.
>
>> \"Hello \n World\"             illegal           illegal
>> \"\"\"Hello \\n World\"\"\"    legal             legal
>> \"Hello \\n World\"            legal             legal
>> \"Hello \u0010 World\"         legal             legal
>> -----------------------------------------------------------------
>
> Yes - it's layering.
>
> Don't forget about using ' for " (for this exact reason).

Yep. Thanks.

>> If someone tries to parse a Turtle | N-Triples file with a literal
>> with the characters '\''n' in it, we have a RiotException:
>>
>> org.openjena.riot.RiotException: [line: 1, col: 68] Broken token (newline): Hello
>> at org.openjena.riot.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:125)
>> at org.openjena.riot.lang.LangEngine.raiseException(LangEngine.java:169)
>> at org.openjena.riot.lang.LangEngine.nextToken(LangEngine.java:116)
>> at org.openjena.riot.lang.LangTurtleBase.predicateObjectItem(LangTurtleBase.java:307)
>> at org.openjena.riot.lang.LangTurtleBase.predicateObjectList(LangTurtleBase.java:289)
>> at org.openjena.riot.lang.LangTurtleBase.triples(LangTurtleBase.java:280)
>> at org.openjena.riot.lang.LangTurtleBase.triplesSameSubject(LangTurtleBase.java:219)
>> at org.openjena.riot.lang.LangTurtle.oneTopLevelElement(LangTurtle.java:46)
>> at org.openjena.riot.lang.LangTurtleBase.runParser(LangTurtleBase.java:144)
>> at org.openjena.riot.lang.LangBase.parse(LangBase.java:43)
>> at org.openjena.riot.RiotLoader.datasetFromString(RiotLoader.java:79)
>> at dev.Run2.main(Run2.java:47)
>>
>> Example here:
>> https://raw.github.com/castagna/jena-examples/master/src/main/java/dev/Run2.java
>>
>> I think this is the right behavior, since the new line character is
>> not legal in a string literal in N-Triples | N-Quads files.
>> It must be escaped '\n' (in a Java string as "\\n" or \u0010).
>>
>> Right?
>
> Looks right to me.

Ok, thanks for the sanity check.

Investigation continues...

We have some data which is coming in as N-Triples and/or Turtle and there
must be something weird with it. Data goes between different "systems" and,
as usual, people use all sorts of tools to generate the data.

Something must be wrong with the data, but it is passing our checks and
causing problems further on (when we assume we have legal Turtle or
N-Triples in our hands).

So, I am trying to understand if there is a problem somewhere... or, simply,
the data is illegal.

Sam contributed this test case:
https://github.com/castagna/jena-examples/blob/master/src/main/resources/data/single-bad-triple.nt
https://github.com/castagna/jena-examples/blob/master/src/main/java/dev/Run3.java

Looking at this right now.

Cheers,
Paolo

>
>>
>> Paolo
>
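
For reference, the layering discussed above (a raw line feed, U+000A, in the Java string has to become the two characters backslash and n before it is written inside a single-quoted-pair literal) can be sketched in plain Java. This is only an illustration under my own assumptions: LiteralEscape and escape are hypothetical names, not part of Jena or RIOT, and the sketch only covers the short escapes for newline, carriage return, tab, double quote and backslash.

```java
// Hypothetical helper for illustration only; it is not a Jena/RIOT API.
final class LiteralEscape {

    // Escapes the content of a Java string so that it can be placed between
    // a pair of double quotes in an N-Triples (or Turtle) literal.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\n': sb.append("\\n");  break; // raw line feed -> backslash + n
                case '\r': sb.append("\\r");  break; // raw carriage return -> backslash + r
                case '\t': sb.append("\\t");  break; // raw tab -> backslash + t
                case '"':  sb.append("\\\""); break; // quote would end the literal
                case '\\': sb.append("\\\\"); break; // backslash starts an escape
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String raw = "Hello \n World";                  // contains a real newline
        System.out.println("\"" + escape(raw) + "\"");  // literal on a single line
    }
}
```

Running main prints the literal on one line, with a backslash followed by n where the raw newline was, which corresponds to the "Hello \\n World" row of the table rather than the "Hello \n World" row.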
