Claude,

My point is more pragmatic than about the ideal design: it is a trade-off between maintaining our own code and using a maintained library.

The jsonld-java parsing process isn't streaming in either use case, so it is never a matter of some triples having been read from the input before the error. The jsonld-java process is layered, not streamed: all the JSON parsing is done, then the conversion to RDF happens.

The two processes are:

(Jena calling low level, non-API calls of jsonld-java):
1a/ Parse JSON
2a/ Do all triples
3a/ Check for trailing junk

vs

(jsonld-java API)
1b/ Parse JSON
2b/ Check for trailing junk
3b/ Do all triples
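
A toy Java sketch of the two orderings, for illustration only. This is not Jena or jsonld-java code: every name is hypothetical, the "parse" is a naive brace match that ignores braces inside strings, and the "triples" are a placeholder string.

```java
import java.util.ArrayList;
import java.util.List;

class OrderingSketch {
    // Hypothetical stand-in for the JSON parse: returns the index just past
    // the first balanced {...}. Toy only: braces inside strings are not handled.
    static int endOfObject(String input) {
        int depth = 0;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) return i + 1;
            }
        }
        throw new IllegalArgumentException("No complete JSON object");
    }

    // True if anything other than whitespace follows the object.
    static boolean hasTrailingJunk(String input, int end) {
        return !input.substring(end).trim().isEmpty();
    }

    // Ordering (a): parse, produce triples, then check for trailing junk.
    static void triplesThenCheck(String input, List<String> sink) {
        int end = endOfObject(input);
        sink.add("triples from object"); // placeholder for the RDF conversion
        if (hasTrailingJunk(input, end)) {
            throw new IllegalStateException("Trailing junk");
        }
    }

    // Ordering (b): parse, check for trailing junk, then produce triples.
    static void checkThenTriples(String input, List<String> sink) {
        int end = endOfObject(input);
        if (hasTrailingJunk(input, end)) {
            throw new IllegalStateException("Trailing junk");
        }
        sink.add("triples from object"); // placeholder for the RDF conversion
    }

    public static void main(String[] args) {
        String input = "{ \"@id\" : \"http://example/s\" } xxxxxxx";
        List<String> a = new ArrayList<>();
        List<String> b = new ArrayList<>();
        try { triplesThenCheck(input, a); } catch (IllegalStateException e) { /* expected */ }
        try { checkThenTriples(input, b); } catch (IllegalStateException e) { /* expected */ }
        System.out.println("(a) entries in sink after error: " + a.size());
        System.out.println("(b) entries in sink after error: " + b.size());
    }
}
```

Both orderings report the error; they differ only in whether the sink has already received output when the error is raised.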

I am wondering if the Elephas tests are tuned to the way Jena works in these error cases, rather than relying on a feature of it.

        Andy

AbstractWholeFileQuadInputFormatTests

On 04/10/15 09:19, Claude Warren wrote:
not Rob but my 2 cents.....

I think that when we read turtle documents, if there is an error the triples
we have already read are left in the graph/model (yes, transactions can
change this).  Shouldn't all parsers follow the same pattern?

Currently that pattern seems to be:  read until eof or error and process
what was read.

Unless I am wrong about the above, I think that the JSON parser should
return the json object that was parsed before the junk.


Claude

On Sat, Oct 3, 2015 at 7:21 PM, Andy Seaborne <[email protected]> wrote:

Upgrading the dependency for jsonld-java to 0.7.0 picks up a bug fix
(jsonld-java issue 144 [1]) that Jena has a workaround for.

The issue is that the Jackson JSON parser does not flag trailing junk. It
reads the JSON object and stops there.  Worse, it creates a buffered reader
so the caller can't handle the stream afterwards.

---------------
{
   "@id" : "http://example/s",
   "http://example/p" : "str"
}
xxxxxxxxxxxxxxx
---------------

Jena (JsonLdReader) contains code taken from jsonld-java and modified to
run the Jackson JSON parser, produce triples and then check for trailing
junk.  The trailing-junk detection was contributed back to the project (PR 145).

jsonld-java treats it more systematically.

If the JSON is syntactically bad in the {}, no triples emerge. The process
is to read the JSON object completely, then let the RDF conversion run.  Bad
object -> no RDF at all.

If there is trailing junk, it is detected before the JSON object is passed
up, so trailing junk means no triples, unlike Jena currently.

I had hoped to remove the workaround and not duplicate jsonld-java code.

Elephas testing is impacted. It is sensitive to the "JSON object, trailing
junk, triples" vs "JSON object, triples, trailing junk" differences.

Unless there is a specific reason to support that behaviour, I'd like to
switch to jsonld-java behaviour.

(Rob) Thoughts?

         Andy

[1] https://github.com/jsonld-java/jsonld-java/issues/144




